Corpus ID: 226222273

Handling Missing Data with Graph Representation Learning

@article{You2020HandlingMD,
  title={Handling Missing Data with Graph Representation Learning},
  author={Jiaxuan You and Xiaobai Ma and Daisy Yi Ding and Mykel J. Kochenderfer and Jure Leskovec},
  journal={ArXiv},
  year={2020},
  volume={abs/2010.16418}
}
Machine learning with missing data has been approached in two different ways: feature imputation, where missing feature values are estimated based on observed values, and label prediction, where downstream labels are learned directly from incomplete data. However, existing imputation models tend to have strong prior assumptions and cannot learn from downstream tasks, while models targeting label prediction often involve heuristics and can encounter scalability issues. Here we propose…
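The abstract is cut off before the proposal, but the graph-representation framing it points toward can be illustrated concretely. Below is a minimal sketch, assuming the bipartite construction this line of work builds on (not the authors' code; the function name and layout are illustrative): an n-by-d data matrix becomes a graph with one node per observation, one node per feature, and one edge per observed entry, so that feature imputation becomes edge-level prediction and label prediction becomes node-level prediction.

```python
import numpy as np

def bipartite_edges(X):
    """X: (n, d) array with np.nan marking missing values.

    Returns edge_index (2, m), pairing observation nodes with feature
    nodes (feature ids offset by n so all node ids are unique), and
    edge_attr (m,), holding one observed value per edge.
    """
    n, _ = X.shape
    rows, cols = np.where(~np.isnan(X))   # observed entries only
    edge_index = np.stack([rows, cols + n])
    edge_attr = X[rows, cols]
    return edge_index, edge_attr

X = np.array([[1.0, np.nan, 3.0],
              [np.nan, 2.0, 0.5]])
edge_index, edge_attr = bipartite_edges(X)  # 4 edges, one per observed value
```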

Citations

Wasserstein Graph Neural Networks for Graphs with Missing Attributes
TLDR
This paper proposes an innovative node representation learning framework, Wasserstein Graph Neural Network (WGNN), that expresses nodes as low-dimensional distributions derived from the decomposition of the attribute matrix, and strengthens the expressiveness of representations by developing a novel message passing schema.
TSI-GNN: Extending Graph Neural Networks to Handle Missing Data in Temporal Settings
TLDR
It is shown that incorporating temporal information into a bipartite graph improves the representation at 30% and 60% missing rates, specifically when using a nonlinear model for downstream prediction tasks on regularly sampled datasets, and that the approach is competitive with existing temporal methods under different scenarios.
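To make the bipartite-plus-time idea in this summary concrete, here is a hedged sketch (not the TSI-GNN implementation; the names and the exact temporal encoding are assumptions): each (sample, time) pair gets its own observation node, and the time index rides along as an edge attribute so that message passing can use temporal position.

```python
import numpy as np

def temporal_bipartite_edges(X):
    """X: (n, t, d) array with np.nan marking missing entries.

    Returns edge_index (2, m), connecting (sample, time) nodes to
    feature nodes, and edge_attr (m, 2), holding (value, time index).
    """
    n, t, d = X.shape
    s, ts, f = np.where(~np.isnan(X))   # indices of observed entries
    obs_node = s * t + ts               # one node per (sample, time)
    feat_node = n * t + f               # feature ids follow obs nodes
    edge_index = np.stack([obs_node, feat_node])
    edge_attr = np.stack([X[s, ts, f], ts.astype(float)], axis=1)
    return edge_index, edge_attr
```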
Accurate Node Feature Estimation with Structured Variational Graph Autoencoder
TLDR
This work proposes SVGA (Structured Variational Graph Autoencoder), an accurate method for feature estimation that combines the advantages of probabilistic inference and graph neural networks, achieving state-of-the-art performance on real datasets.
Wasserstein diffusion on graphs with missing attributes
TLDR
This paper extends the message passing schema of general graph neural networks to a Wasserstein space derived from the decomposition of attribute matrices, finds that WGD is well suited to recovering missing values, and adapts it to tackle matrix completion problems with graphs of users and items.
Towards Open-World Feature Extrapolation: An Inductive Graph Learning Approach
TLDR
This work proposes a new learning paradigm based on graph representation and learning for the open-world feature extrapolation problem, in which the feature space of the input data expands and a model trained on partially observed features must handle new features in test data without further retraining.
Filling the G_ap_s: Multivariate Time Series Imputation by Graph Neural Networks
TLDR
A novel graph neural network architecture is introduced, named GRIN, which aims at reconstructing missing data in the different channels of a multivariate time series by learning spatio-temporal representations through message passing and outperforms state-of-the-art methods in the imputation task on relevant real-world benchmarks.
Semi-supervised Learning with Missing Values Imputation
TLDR
A novel semi-supervised conditional normalizing flow (SSCFlow) is proposed, which treats the initialized missing values as a corrupted initial imputation and iteratively reconstructs their latent representations with an overcomplete denoising autoencoder to approximate their true conditional distribution.
Multivariate Time Series Imputation by Graph Neural Networks
TLDR
A novel graph neural network architecture is introduced, named GRIL, which aims at reconstructing missing data in the different channels of a multivariate time series by learning spatial-temporal representations through message passing and preliminary empirical results show that this model outperforms state-of-the-art methods in the imputation task on relevant benchmarks.
Semi-supervised Conditional Density Estimation for Imputation and Classification of Incomplete Instances
TLDR
A novel semi-supervised conditional normalizing flow (SSCFlow) is proposed in this paper, which takes the initialized missing values as a corrupted initial imputation and iteratively reconstructs their latent representations with an overcomplete denoising autoencoder to approximate the true conditional probability density of the missing values.
Siamese Attribute-missing Graph Auto-encoder
TLDR
This paper proposes the Siamese Attribute-missing Graph Auto-encoder (SAGA) and introduces a siamese network structure to share the parameters learned by both processes, which allows the network training to benefit from more abundant and diverse information.
...

References

Showing 1-10 of 66 references
Missing Data Imputation with Adversarially-trained Graph Convolutional Networks
Inductive Representation Learning on Large Graphs
TLDR
GraphSAGE is presented, a general, inductive framework that leverages node feature information (e.g., text attributes) to efficiently generate node embeddings for previously unseen data and outperforms strong baselines on three inductive node-classification benchmarks.
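As a concrete illustration of the inductive aggregation GraphSAGE performs, here is a minimal sketch of a single layer with the mean aggregator; the weight matrices would be learned by gradient descent, and the names and shapes are illustrative rather than taken from the paper's code.

```python
import numpy as np

def sage_layer(H, neighbors, W_self, W_neigh):
    """H: (n, d) node features; neighbors[v]: list of v's neighbors;
    W_self, W_neigh: (d, d_out) weight matrices."""
    out = np.empty((H.shape[0], W_self.shape[1]))
    for v, nbrs in enumerate(neighbors):
        # Aggregate neighbor features by their mean (the "mean" variant).
        agg = H[nbrs].mean(axis=0) if nbrs else np.zeros(H.shape[1])
        h = H[v] @ W_self + agg @ W_neigh     # combine self and neighborhood
        out[v] = np.maximum(h, 0.0)           # ReLU nonlinearity
    norms = np.linalg.norm(out, axis=1, keepdims=True)
    return out / np.maximum(norms, 1e-12)     # L2-normalize embeddings
```

Because each embedding is computed from the node's own features and its local neighborhood rather than from a fixed embedding table, a node unseen during training can still be embedded, which is what makes the framework inductive.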
Handling Missing Data in Trees: Surrogate Splits or Statistical Imputation
TLDR
Simulation-based data augmentation to handle missing data, which is based on filling in (imputing) one or more plausible values for the missing data, is investigated, showing that imputation tends to outperform surrogate splits in terms of the predictive accuracy of the resulting models.
MissForest - non-parametric missing value imputation for mixed-type data
TLDR
In this comparative study, missForest outperforms other imputation methods, especially in data settings where complex interactions and non-linear relations are suspected, and the out-of-bag imputation error estimates of missForest prove to be adequate in all settings.
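The missForest idea (iteratively regress each variable on all the others with a random forest, cycling until the imputations stabilize) can be approximated in a few lines with scikit-learn. This is a sketch of the general approach, not the original R package; the data and hyperparameters are illustrative.

```python
import numpy as np
from sklearn.experimental import enable_iterative_imputer  # noqa: F401
from sklearn.impute import IterativeImputer
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
X[rng.random(X.shape) < 0.2] = np.nan   # remove ~20% of entries at random

# Round-robin imputation: each feature is regressed on the others
# with a random forest, repeated for up to max_iter rounds.
imputer = IterativeImputer(
    estimator=RandomForestRegressor(n_estimators=50, random_state=0),
    max_iter=10,
    random_state=0,
)
X_imputed = imputer.fit_transform(X)    # dense array with no NaNs
```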
Inductive Matrix Completion Based on Graph Neural Networks
TLDR
It is possible to train inductive matrix completion models without using side information while achieving similar or better performance than state-of-the-art transductive methods; local graph patterns around a (user, item) pair are effective predictors of the rating this user gives to the item; and long-range dependencies might not be necessary for modeling recommender systems.
Position-aware Graph Neural Networks
TLDR
Position-aware Graph Neural Networks (P-GNNs) are proposed, a new class of GNNs for computing position-aware node embeddings that are inductive, scalable, and can incorporate node feature information.
Geometric Matrix Completion with Recurrent Multi-Graph Neural Networks
TLDR
This paper proposes a novel approach to overcome limitations of matrix completion techniques by using geometric deep learning on graphs, and applies this method on both synthetic and real datasets, showing that it outperforms state-of-the-art techniques.
Graph Convolutional Neural Networks for Web-Scale Recommender Systems
TLDR
A novel method based on highly efficient random walks to structure the convolutions is developed, together with a novel training strategy that relies on harder-and-harder training examples to improve the robustness and convergence of the model.
Adjusted weight voting algorithm for random forests in handling missing values
DropEdge: Towards Deep Graph Convolutional Networks on Node Classification
TLDR
DropEdge is a general technique that can be combined with many other backbone models (e.g., GCN, ResGCN, GraphSAGE, and JKNet) for enhanced performance, and it consistently improves performance on a variety of both shallow and deep GCNs.
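The mechanism itself is simple enough to sketch: at every training step, drop a random fraction of edges before message passing, which acts as graph data augmentation. The helper below is illustrative (not the paper's code) and assumes edges are stored as a 2×m index array.

```python
import numpy as np

def drop_edge(edge_index, drop_rate, rng):
    """edge_index: (2, m) integer array of edges. Keep each edge
    independently with probability 1 - drop_rate."""
    keep = rng.random(edge_index.shape[1]) >= drop_rate
    return edge_index[:, keep]

rng = np.random.default_rng(0)
edges = np.array([[0, 1, 2, 3], [1, 2, 3, 0]])
print(drop_edge(edges, drop_rate=0.5, rng=rng))  # a random edge subset
```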
...