Corpus ID: 247084494

Simultaneous Missing Value Imputation and Structure Learning with Groups

Pablo Morales-Álvarez, Wenbo Gong, A. Lamb, Simon Woodhead, Simon L. Peyton Jones, Nick Pawlowski, Miltiadis Allamanis, Cheng Zhang
Learning structures between groups of variables from data with missing values is an important real-world task, yet it is difficult to solve. One typical scenario is discovering the structure among topics in the education domain in order to identify learning pathways. Here, the observations are student performances on the questions under each topic, and these observations contain missing values. However, most existing methods focus on learning structures between a few individual variables from complete data. In this…



Handling Incomplete Heterogeneous Data using VAEs

Knowledge Tracing Machines: Factorization Machines for Knowledge Tracing

Factorization machines (FMs), a model for regression or classification, are shown to encompass several existing models in the educational literature as special cases, notably the additive factor model, performance factors, and multidimensional item response theory.

DAG-GNN: DAG Structure Learning with Graph Neural Networks

A deep generative model is proposed, and a variant of the structural constraint is applied to learn the DAG. The method learns more accurate graphs for nonlinearly generated samples, and on benchmark data sets with discrete variables the learned graphs are reasonably close to the global optima.
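
The structural constraint mentioned in this entry can be illustrated with a small numeric check. The sketch below assumes the polynomial form tr[(I + αW∘W)^d] − d, which is zero exactly when the weighted adjacency matrix W describes an acyclic graph. The function names, the default α = 1/d, and the toy matrices are illustrative assumptions, not code from the paper.

```python
def matmul(A, B):
    """Multiply two square matrices given as lists of lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def acyclicity_penalty(W, alpha=None):
    """Return tr[(I + alpha * W∘W)^d] - d for a d x d weight matrix W.

    The value is 0 for an acyclic graph and positive when W contains a cycle.
    alpha defaults to 1/d (an assumption made here for numerical scale).
    """
    d = len(W)
    if alpha is None:
        alpha = 1.0 / d
    # M = I + alpha * (W squared elementwise), so every entry is non-negative.
    M = [[(1.0 if i == j else 0.0) + alpha * W[i][j] ** 2 for j in range(d)]
         for i in range(d)]
    # Start from the identity and multiply by M d times to get M^d.
    P = [[1.0 if i == j else 0.0 for j in range(d)] for i in range(d)]
    for _ in range(d):
        P = matmul(P, M)
    return sum(P[i][i] for i in range(d)) - d


# Toy examples (hypothetical): one edge 0 -> 1, and a 2-cycle 0 <-> 1.
dag = [[0.0, 1.5], [0.0, 0.0]]
cyclic = [[0.0, 1.0], [1.0, 0.0]]

print(acyclicity_penalty(dag))     # zero for the acyclic graph
print(acyclicity_penalty(cyclic))  # strictly positive for the cycle
```

Because the penalty is a differentiable function of W, it can be added to a continuous training objective instead of searching over graphs combinatorially.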

Concave penalized estimation of sparse Gaussian Bayesian networks

This work develops a penalized likelihood estimation framework to estimate the structure of Gaussian Bayesian networks from observational data using concave regularization and provides theoretical guarantees which generalize existing asymptotic results when the underlying distribution is Gaussian.

Learning Sparse Nonparametric DAGs

A completely general framework for learning sparse nonparametric directed acyclic graphs (DAGs) from data is developed that can be applied to general nonlinear models, general differentiable loss functions, and generic black-box optimization routines.

Conditional Inference in Pre-trained Variational Autoencoders via Cross-coding

This paper proposes an idea the authors term cross-coding to approximate the distribution over the latent variables after conditioning on an evidence assignment to some subset of the variables, which allows generating query samples without retraining the full VAE.

Auto-Encoding Variational Bayes

A stochastic variational inference and learning algorithm that scales to large datasets and, under some mild differentiability conditions, even works in the intractable case is introduced.

MICE: Multivariate Imputation by Chained Equations in R

Mice adds new functionality for imputing multilevel data, automatic predictor selection, data handling, post-processing imputed values, specialized pooling routines, model selection tools, and diagnostic graphs.

Penalized estimation of directed acyclic graphs from discrete data

A maximum penalized likelihood method for learning Bayesian networks from discrete or categorical data, which models the conditional distribution of a node given its parents by multi-logit regression instead of the commonly used multinomial distribution.

Fast causal inference with non-random missingness by test-wise deletion

The theoretical results show that test-wise deletion is sound under the justifiable assumption that none of the missingness mechanisms causally affect each other in the underlying causal graph. FCI and RFCI with test-wise deletion are found to outperform their list-wise deletion and imputation counterparts on average when MNAR holds, in both synthetic and real data.