# Causal discovery in the presence of missing data

@article{Tu2019CausalDI, title={Causal discovery in the presence of missing data}, author={Ruibo Tu and Cheng Zhang and Paul W. Ackermann and Hedvig Kjellstr{\"o}m and Kun Zhang}, journal={ArXiv}, year={2019}, volume={abs/1807.04010} }

Missing data are ubiquitous in many domains such as healthcare. When these data entries are not missing completely at random, the (conditional) independence relations in the observed data may be di ...

## 28 Citations

### Causal Discovery in the Presence of Missing Values for Neuropathic Pain Diagnosis

- Computer Science
- 2020

The constraint-based causal discovery method PC is extended to handle binary data sets with missing values for neuropathic pain diagnosis, and the potential errors of simply applying PC to data sets with missing values are identified.

### MissDAG: Causal Discovery in the Presence of Missing Data with Continuous Additive Noise Models

- Computer Science
- 2022

MissDAG maximizes the expected likelihood of the visible part of observations under the expectation-maximization (EM) framework. Its flexibility for incorporating various causal discovery algorithms and its efficacy are demonstrated through extensive simulations and real data experiments.

### Full Law Identification In Graphical Models Of Missing Data: Completeness Results

- Mathematics
- ICML
- 2020

This paper provides the first completeness result in this field of study: necessary and sufficient graphical conditions under which the full data distribution can be recovered from the observed data distribution.

### A practical guide to causal discovery with cohort data

- Computer Science
- 2021

This guide presents how to perform constraint-based causal discovery using three popular software packages (pcalg, bnlearn, and TETRAD), points out the relative strengths and limitations of each package, and gives practical recommendations.

### Greedy structure learning from data that contains systematic missing values

- Computer Science
- Machine Learning
- 2022

The empirical investigations show that the proposed approach outperforms the commonly used and state-of-the-art Structural EM algorithm in terms of both learning accuracy and efficiency, both when data are missing at random and when they are missing not at random.

### Multiple imputation and test‐wise deletion for causal discovery with incomplete cohort data

- Computer Science
- Statistics in Medicine
- 2022

This article establishes necessary and sufficient conditions for the recoverability of causal structures under test‐wise deletion, and argues that multiple imputation is more challenging in the context of causal discovery than for estimation.

### MIRACLE: Causally-Aware Imputation via Learning Missing Data Mechanisms

- Computer Science
- NeurIPS
- 2021

This work develops a regularization scheme that encourages any baseline imputation method to be causally consistent with the underlying data generating mechanism, and proposes a causally-aware imputation algorithm, MIRACLE, that is able to consistently improve imputation over a variety of benchmark methods.

### Star-causality and factor analysis: old stories and new perspectives

- Business
- Applied Informatics
- 2017

In this paper, studies on conditional independence-based causality are briefly reviewed along a line of observable two-variable, three-variable, star decomposable, and tree decomposable structures, as well as their relationship to factor analysis.

### Causal discovery of gene regulation with incomplete data

- Computer Science
- Journal of the Royal Statistical Society: Series A (Statistics in Society)
- 2020

This work applied causal discovery to obtain novel insights into the genetic regulation underlying head‐and‐neck squamous cell carcinoma, and proposed a new procedure combining constraint‐based causal discovery with multiple imputation based on using Rubin's rules for pooling tests of conditional independence.

### On Testability and Goodness of Fit Tests in Missing Data Models

- Computer Science
- ArXiv
- 2022

New insights are provided on the testable implications of three broad classes of missing data graphical models, and how to design goodness-of-fit tests around them.

## References

### Graphical Models for Inference with Missing Data

- Computer Science
- 2014

This work employs a formal representation called ‘Missingness Graphs’ to explicitly portray the causal mechanisms responsible for missingness and to encode dependencies between these mechanisms and the variables being measured.

### Missing Data as a Causal and Probabilistic Problem

- Computer Science, Mathematics
- UAI
- 2015

This paper extends the converse approach of [7] of representing missing data problems to causal models where only interventions on missingness indicators are allowed to give a general criterion for cases where a joint distribution containing missing variables can be recovered from data actually observed, given assumptions on missingness mechanisms.

### Identification In Missing Data Models Represented By Directed Acyclic Graphs

- Computer Science, Mathematics
- UAI
- 2019

This paper proposes a new algorithm that significantly generalizes the types of manipulations used in the ID algorithm, developed in the context of causal inference, in order to obtain identification.

### Handling hybrid and missing data in constraint-based causal discovery to study the etiology of ADHD

- Computer Science
- International Journal of Data Science and Analytics
- 2016

A new method is developed, based on the assumptions that data are missing at random and that continuous variables obey a nonparanormal distribution, to help in understanding the etiology of attention-deficit/hyperactivity disorder (ADHD).

### Inference and Missing Data

- Geology
- 1975

Two results are presented concerning inference when data may be missing. First, ignoring the process that causes missing data when making sampling distribution inferences about the parameter of the…

### Structure Learning Under Missing Data

- Computer Science
- PGM
- 2018

This paper discusses adjustments that must be made to existing structure learning algorithms to properly account for missing data, and gives an algorithm for the simpler setting where the underlying graph is unknown, but the missing data model is known.

### Estimation with Incomplete Data: The Linear Case

- Computer Science, Mathematics
- IJCAI
- 2018

This work devises model-based methods to consistently estimate mean, variance and covariance given data that are Missing Not At Random (MNAR), and extends the analysis to continuous variables drawn from Gaussian distributions.

### On the Testability of Models with Missing Data

- Mathematics
- AISTATS
- 2014

This work uses the results to show that model sensitivity persists in almost all models typically categorized as MNAR, and provides sufficient conditions to detect the existence of dependence between a variable and its missingness mechanism.

### Graphical Models for Recovering Probabilistic and Causal Queries from Missing Data

- Computer Science
- NIPS
- 2014

It is shown that causal queries may be recoverable even when the factors in their identifying estimands are not recoverable. For problems of attrition, the recovery of causal effects from data corrupted by attrition is characterized.

### A Linear Non-Gaussian Acyclic Model for Causal Discovery

- Computer Science
- J. Mach. Learn. Res.
- 2006

This work shows how to discover the complete causal structure of continuous-valued data, under the assumptions that (a) the data generating process is linear, (b) there are no unobserved confounders, and (c) disturbance variables have non-Gaussian distributions of non-zero variances.