Corpus ID: 208176134

Why Not to Use Zero Imputation? Correcting Sparsity Bias in Training Neural Networks

Joonyoung Yi, Juhyuk Lee, Kwang Joon Kim, Sung Ju Hwang, Eunho Yang
Handling missing data is one of the most fundamental problems in machine learning. Among many approaches, the simplest and most intuitive is zero imputation, which treats the value of a missing entry simply as zero. However, many studies have experimentally confirmed that zero imputation results in suboptimal performance when training neural networks. Yet none of the existing work has explained what causes this performance degradation. In this paper, we introduce the variable sparsity…
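To make the zero imputation described in the abstract concrete, here is a minimal NumPy sketch. It is not the authors' code: the helper `zero_impute`, the single linear unit with fixed weights `w`, and the observation rates 0.9 and 0.1 are all hypothetical choices made for illustration. It only shows that, after zero imputation, the output of a linear unit depends on how many entries happen to be observed, even when the underlying feature values are identical.

```python
import numpy as np


def zero_impute(x, observed):
    """Zero imputation: keep observed entries, replace missing ones with 0."""
    return np.where(observed, x, 0.0)


rng = np.random.default_rng(0)
d = 1000
w = np.full(d, 0.01)  # hypothetical fixed weights of one linear unit
x = np.ones(d)        # every feature has the same underlying value


def unit_output(observed_rate):
    """Output of the linear unit when each entry is observed with the given rate."""
    observed = rng.random(d) < observed_rate  # boolean mask, True = observed
    return float(w @ zero_impute(x, observed))


dense = unit_output(0.9)   # most entries observed
sparse = unit_output(0.1)  # most entries missing
print(dense, sparse)       # the sparse input yields a much smaller activation
```

The same mask-based helper also works when missing entries are stored as `NaN`, since `np.where` selects the replacement value wherever the mask is `False`.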
How to deal with missing data in supervised deep learning
A deep latent variable model can be learned jointly with the discriminative model, end to end, using importance-weighted variational inference. This hybrid approach, which mimics multiple imputation, also allows the data to be imputed by relying on both the discriminative and the generative model.
Generative Imputation and Stochastic Prediction
The experimental results show the effectiveness of the proposed method in generating imputations as well as providing estimates for the class uncertainties in a classification task when faced with missing values.
MAIN: Multihead-Attention Imputation Networks
This work proposes a novel mechanism based on multi-head attention which can be applied effortlessly in any model and achieves better downstream performance without the introduction of the full dataset in any part of the modeling pipeline.
Debiasing Averaged Stochastic Gradient Descent to handle missing values
In both streaming and finite-sample settings, it is proved that this averaged stochastic gradient algorithm handling missing values in linear models achieves a convergence rate of O(1/n) at iteration n, the same as without missing values.
FedNI: Federated Graph Learning with Network Inpainting for Population-Based Disease Prediction
This work proposes a framework, FedNI, that leverages network inpainting and inter-institutional data via federated learning (FL). It first federatively trains a missing node and edge predictor using a graph generative adversarial network (GAN) to complete the missing information of local networks.
A Random Matrix Analysis of Learning with α-Dropout
This article studies a one hidden layer neural network with generalized Dropout (α-Dropout), where the dropped out features are replaced with an arbitrary value α. Specifically, under a large
Probabilistic personalised cascade with abstention
An efficient and robust approach is introduced, based on a probabilistic graphical model that represents a unified probabilistic classifier and can be applied at any stage of a multi-stage sequential model.
Collaborative Reflection-Augmented Autoencoder Network for Recommender Systems
Lianghao Xia, Chao Huang, Yong Xu, Huance Xu, Xiang Li, Weiguo Zhang. Computer Science. ACM Transactions on Information Systems, 2022.
As the deep learning techniques have expanded to real-world recommendation tasks, many deep neural network based Collaborative Filtering (CF) models have been developed to project user-item
Graph Convolutional Networks for Graphs Containing Missing Features
This approach integrates the processing of missing features and graph learning within the same neural network architecture and demonstrates through extensive experiments that this approach significantly outperforms the imputation based methods in node classification and link prediction tasks.


GAIN: Missing Data Imputation using Generative Adversarial Nets
This work proposes a novel method for imputing missing data by adapting the well-known Generative Adversarial Nets (GAN) framework and calls it GAIN, which significantly outperforms state-of-the-art imputation methods.
MIDA: Multiple Imputation Using Denoising Autoencoders
Evaluation on several real-life datasets shows that the proposed multiple imputation model, based on overcomplete deep denoising autoencoders, significantly outperforms current state-of-the-art methods under varying conditions while simultaneously improving end-of-the-line analytics.
Multivariate Time Series Imputation with Generative Adversarial Networks
Experiments show that the proposed model outperformed the baselines in terms of accuracy of imputation, and a simple model on the imputed data can achieve state-of-the-art results on the prediction tasks, demonstrating the benefits of the model in downstream applications.
Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift
Applied to a state-of-the-art image classification model, Batch Normalization achieves the same accuracy with 14 times fewer training steps, and beats the original model by a significant margin.
Representation learning via Dual-Autoencoder for recommendation
A new representation learning framework called Recommendation via Dual-Autoencoder (ReDa) is proposed, which simultaneously learns the new hidden representations of users and items using autoencoders, and develops a gradient descent method to learn hidden representations.
Dropout: a simple way to prevent neural networks from overfitting
It is shown that dropout improves the performance of neural networks on supervised learning tasks in vision, speech recognition, document classification and computational biology, obtaining state-of-the-art results on many benchmark data sets.
Understanding the difficulty of training deep feedforward neural networks
The objective here is to understand better why standard gradient descent from random initialization is doing so poorly with deep neural networks, to better understand these recent relative successes and help design better algorithms in the future.
MADE: Masked Autoencoder for Distribution Estimation
This work introduces a simple modification for autoencoder neural networks that yields powerful generative models and proves that this approach is competitive with state-of-the-art tractable distribution estimators.
BRITS: Bidirectional Recurrent Imputation for Time Series
BRITS is a novel method based on recurrent neural networks for missing value imputation in time series data that directly learns the missing values in a bidirectional recurrent dynamical system, without any specific assumption.
MisGAN: Learning from Incomplete Data with Generative Adversarial Networks
The proposed GAN-based framework learns a complete data generator along with a mask generator that models the missing data distribution and demonstrates how to impute missing data by equipping the framework with an adversarially trained imputer.