Corpus ID: 215416270

Directed Graphical Models and Causal Discovery for Zero-Inflated Data

@article{Yu2020DirectedGM,
  title={Directed Graphical Models and Causal Discovery for Zero-Inflated Data},
  author={Shiqing Yu and Mathias Drton and Ali Shojaie},
  journal={arXiv: Methodology},
  year={2020}
}
Modern RNA sequencing technologies provide gene expression measurements from single cells that promise refined insights on regulatory relationships among genes. Directed graphical models are well-suited to explore such (cause-effect) relationships. However, statistical analyses of single cell data are complicated by the fact that the data often show zero-inflated expression patterns. To address this challenge, we propose directed graphical models that are based on Hurdle conditional… 

Sequential Learning of the Topological Ordering for the Linear Non-Gaussian Acyclic Model with Parametric Noise

TLDR
This paper develops a novel sequential approach to estimate the causal ordering of a DAG using a linear structural equation model with non-Gaussian noise, a model known as the Linear Non-Gaussian Acyclic Model (LiNGAM).

References

Graphical Models for Zero-Inflated Single Cell Gene Expression

TLDR
A multivariate Hurdle model is proposed, comprising a mixture of singular Gaussian distributions, that infers network structure not revealed by other methods or in bulk data sets, and that is more sensitive than existing approaches in simulations.

High-dimensional causal discovery under non-Gaussianity

TLDR
This work considers graphical models based on a recursive system of linear structural equations and proposes an algorithm that yields consistent estimates of the graph also in high-dimensional settings in which the number of variables may grow at a faster rate than the number of observations, but in which the underlying causal structure features suitable sparsity.

A Linear Non-Gaussian Acyclic Model for Causal Discovery

TLDR
This work shows how to discover the complete causal structure of continuous-valued data, under the assumptions that (a) the data generating process is linear, (b) there are no unobserved confounders, and (c) disturbance variables have non-Gaussian distributions of non-zero variances.

Causal discovery with continuous additive noise models

TLDR
If the observational distribution follows a structural equation model with an additive noise structure, the directed acyclic graph becomes identifiable from the distribution under mild conditions, which constitutes an interesting alternative to traditional methods that assume faithfulness and identify only the Markov equivalence class of the graph, thus leaving some edges undirected.

Identifiability of Gaussian structural equation models with equal error variances

TLDR
This work proves full identifiability in the case where all noise variables have the same variance: the directed acyclic graph can be recovered from the joint Gaussian distribution.

Identifiability of Gaussian Structural Equation Models with Dependent Errors Having Equal Variances

In this paper, we prove that some Gaussian structural equation models with dependent errors having equal variances are identifiable from their corresponding Gaussian distributions. Specifically, we

Scale-free networks in cell biology

TLDR
The observed topologies of cellular networks give clues about their evolution and how their organization influences their function and dynamic responses, and the opportunity to describe quantitatively a network of hundreds or thousands of interacting components is offered.

A Simple Approach for Finding the Globally Optimal Bayesian Network Structure

TLDR
It is shown that it is possible to learn the best Bayesian network structure with over 30 variables, which covers many practically interesting cases and offers a possibility for efficient exploration of the best networks consistent with different variable orderings.

Generalized Score Matching for Non-Negative Data

TLDR
This paper gives a generalized form of score matching for non-negative data that improves estimation efficiency and addresses an overlooked inexistence problem by generalizing the regularized score matching method of Lin et al. (2016) and improving its theoretical guarantees for non-negative Gaussian graphical models.

Optimal Structure Identification With Greedy Search

TLDR
This paper proves the so-called "Meek Conjecture", which shows that if a DAG H is an independence map of another DAG G, then there exists a finite sequence of edge additions and covered edge reversals in G such that H remains an independence map of G and after all modifications G = H.