• Corpus ID: 238407713

Gradient Importance Learning for Incomplete Observations

Qitong Gao, Dong Wang, Joshua D. Amason, Siyang Yuan, Chenyang Tao, Ricardo Henao, Majda Hadziahmetovic, Lawrence Carin, Miroslav Pajic
International Conference on Learning Representations

Though recent works have developed methods that can generate estimates (or imputations) of the missing entries in a dataset to facilitate downstream analysis, most depend on assumptions that may not align with real-world applications and could suffer from poor performance in subsequent tasks such as classification. This is particularly true if the data have large missingness rates or a small sample size. More importantly, the imputation error could be propagated into the prediction step that… 

Variational Latent Branching Model for Off-Policy Evaluation

The variational latent branching model (VLBM) is proposed to learn the transition function of MDPs by formulating the environmental dynamics as a compact latent space, from which the next states and rewards are then sampled.

A Reinforcement Learning-Informed Pattern Mining Framework for Multivariate Time Series Classification

This work proposes a reinforcement learning (RL) informed PAttern Mining framework (RLPAM) to identify interpretable yet important patterns for multivariate time series (MTS) classification and shows how RL-informed patterns can be interpretable and can improve the understanding of septic shock progression.

Reconstructing Missing EHRs Using Time-Aware Within- and Cross-Visit Information for Septic Shock Early Prediction

This work proposes a Time-Aware Dual-Cross-Visit missing value imputation method, named TA-DualCV, which spontaneously leverages multivariate dependencies across features and longitudinal dependencies both within and across visits to maximize the information extracted from limited observable records in EHRs.



Machine learning for the prediction of sepsis: a systematic review and meta-analysis of diagnostic test accuracy

It is shown that, on retrospective data, individual machine learning models can accurately predict sepsis onset ahead of time, although between-study heterogeneity limits the assessment of pooled results.

GP-VAE: Deep Probabilistic Time Series Imputation

This work proposes a new deep sequential latent variable model for dimensionality reduction and data imputation of multivariate time series from the domains of computer vision and healthcare, and demonstrates that this approach outperforms several classical and deep learning-based data imputation methods on high-dimensional data.

GAIN: Missing Data Imputation using Generative Adversarial Nets

This work proposes a novel method for imputing missing data by adapting the well-known Generative Adversarial Nets (GAN) framework and calls it GAIN, which significantly outperforms state-of-the-art imputation methods.

Statistical analysis with missing data, volume 793 (2019)

MIWAE: Deep Generative Modelling and Imputation of Incomplete Data Sets

This work presents a simple technique to train DLVMs when the training set contains missing-at-random data, and develops Monte Carlo techniques for single and multiple imputation using a DLVM trained on an incomplete data set.

Multiple imputation by chained equations: what is it and how does it work?

This paper provides an introduction to the MICE method with a focus on practical aspects and challenges in using this method.

MIMIC-III, a freely accessible critical care database

MIMIC-III (‘Medical Information Mart for Intensive Care’) is a large, single-center database comprising information relating to patients admitted to critical care units at a large tertiary care hospital.

Continuous control with deep reinforcement learning

This work presents an actor-critic, model-free algorithm based on the deterministic policy gradient that can operate over continuous action spaces, and demonstrates that for many of the tasks the algorithm can learn policies end-to-end: directly from raw pixel inputs.

Recurrent Models of Visual Attention

A novel recurrent neural network model that is capable of extracting information from an image or video by adaptively selecting a sequence of regions or locations and only processing the selected regions at high resolution is presented.

A Data-Driven Approach to Predicting Septic Shock in the Intensive Care Unit

A data-driven, expert knowledge agnostic method is used to build a screening algorithm for early detection of septic shock and demonstrates strong performance in the data set used and provides a basis for expanding this work toward building an algorithm that is used for large-scale automated screening to identify high-risk patients based on electronic medical record data in real time.