Efficient and accurate causal inference with hidden confounders from genome-transcriptome variation data

@article{Wang2016EfficientAA,
  title={Efficient and accurate causal inference with hidden confounders from genome-transcriptome variation data},
  author={Lingfei Wang and Tom Michoel},
  journal={PLoS Computational Biology},
  year={2016},
  volume={13}
}
Mapping gene expression as a quantitative trait using whole-genome sequencing and transcriptome analysis allows the discovery of the functional consequences of genetic variation. We developed a novel method and ultra-fast software, Findr, for highly accurate causal inference between gene expression traits using cis-regulatory DNA variations as causal anchors, which improves on current methods by taking into consideration hidden confounders and weak regulations. Findr outperformed existing methods on the… 

Whole-Transcriptome Causal Network Inference with Genomic and Transcriptomic Data.

This work demonstrates the reconstruction of causal gene networks with the program Findr on 3000 genes from the Geuvadis dataset, and reveals major hub genes in the reconstructed network.

Whole-transcriptome causal network inference with genomic and transcriptomic data

This work demonstrates the reconstruction of causal gene networks with the program Findr on 3,000 genes from the Geuvadis dataset and reveals major hub genes in the reconstructed network.

Robust discovery of causal gene networks via measurement error estimation and correction

A new framework for causal discovery that is robust against measurement noise is proposed by extending an established statistical approach, CIT (Causal Inference Test). A two-stage approach called RCD (Robust Causal Discovery) is developed, which first estimates measurement error from gene expression data and then incorporates it to obtain consistent parameter estimates that can be used with appropriately extended versions of the statistical tests of correlation or mediation performed in the original CIT.

Causal gene regulatory network inference using enhancer activity as a causal anchor

It is proposed that a causal inference framework successfully used for eQTL data can be extended to infer causal regulatory networks using enhancers as causal anchors and enhancer RNA expression as a readout of enhancer activity.

Comparison between instrumental variable and mediation-based methods for reconstructing causal gene networks in yeast

In conclusion, causal inference from genomics and transcriptomics data is a powerful approach for reconstructing causal gene networks, which could be further improved by the development of methods to control for residual correlations in mediation analyses and genomic linkage and pleiotropic effects from transcriptional hotspots in instrumental variable analyses.

Causal Transcription Regulatory Network Inference Using Enhancer Activity as a Causal Anchor

Predicted causal targets of transcription factors (TFs) in mouse embryonic stem cells, macrophages and erythroblastic leukaemia overlapped significantly with experimentally-validated targets from ChIP-seq and perturbation data, demonstrating that variability within a cell type is highly relevant for target prediction of cell type-specific factors.

Comparison between instrumental variable and mediation-based methods for reconstructing causal gene networks in yeast.

In conclusion, causal inference from genomics and transcriptomics data is a powerful approach for reconstructing causal gene networks, which could be further improved by the development of methods to control for residual correlations in mediation analyses, and for genomic linkage and pleiotropic effects from transcriptional hotspots in instrumental variable analyses.

Learning causal biological networks with the principle of Mendelian randomization

This work extends the interpretation of the Principle of Mendelian randomization (PMR) and presents MRPC, a novel machine learning algorithm that incorporates the PMR in classical algorithms for learning causal graphs in computer science.

eQTLs as causal instruments for the reconstruction of hormone linked gene networks

It is demonstrated how causal inference and gene networks can be used to describe the impact of hormone linked genetic variation upon the transcriptome within an endocrinology context.

Learning Causal Biological Networks With the Principle of Mendelian Randomization

Although large amounts of genomic data are available, it remains a challenge to reliably infer causal (i.e., regulatory) relationships among molecular phenotypes (such as gene expression),

Transcriptome and genome sequencing uncovers functional variation in humans

Sequencing and deep analysis of messenger RNA and microRNA from lymphoblastoid cell lines of 462 individuals from the 1000 Genomes Project, the first uniformly processed high-throughput RNA-sequencing data from multiple human populations with high-quality genome sequences, reveals extremely widespread genetic variation affecting the regulation of most genes.

An integrative genomics approach to infer causal associations between gene expression and disease

It is shown that this approach can predict transcriptional responses to single gene–perturbation experiments using gene-expression data in the context of a segregating mouse population and the utility of this approach is demonstrated by identifying and experimentally validating the involvement of three new genes in susceptibility to obesity.

Integration of summary data from GWAS and eQTL studies predicts complex trait gene targets

A method is proposed that integrates summary-level data from GWAS with data from expression quantitative trait locus (eQTL) studies to identify genes whose expression levels are associated with a complex trait because of pleiotropy, and prioritize 126 genes that provide important leads to design future functional studies.

Modeling Causality for Pairs of Phenotypes in System Genetics

Tests of causal direction for a pair of phenotypes that may be embedded in a more complicated but unobserved network by extending Vuong’s selection tests for misspecified models are developed, showing greatly reduced false-positive rates compared to the alternative approaches.

An effective framework for reconstructing gene regulatory networks from genetical genomics data

This work presents a framework for reconstructing gene regulatory networks from genetical genomics data where genotype and phenotype correlation measures are used to derive an initial graph which is subsequently reduced by pruning strategies to minimize false positive predictions.

Characterizing the role of miRNAs within gene regulatory networks using integrative genomics techniques

This work shows how integrative genomics approaches can be used to characterize the role played by approximately a third of registered mouse miRNAs within the context of a liver gene regulatory network and provides evidence supporting the hypothesis that miRNAs can act cooperatively or redundantly to regulate a given pathway.

How to infer gene networks from expression profiles

It is shown that reverse‐engineering algorithms are indeed able to correctly infer regulatory interactions among genes, at least when one performs perturbation experiments complying with the algorithm requirements.

Gene Regulatory Network Reconstruction Using Bayesian Networks, the Dantzig Selector, the Lasso and Their Meta-Analysis

A simple yet very powerful meta-analysis is proposed, which combines a wide panel of methods ranging from Bayesian networks to penalised linear regressions to analyse gene regulatory networks from different genetical genomics data sets and was ranked first among the teams participating in Challenge 3A.

The role of regulatory variation in complex traits and disease

Recent insights into the molecular nature of regulatory variants are reviewed and examples of complete chains of causality that link individual polymorphisms to changes in gene expression, which in turn result in physiological changes and, ultimately, disease risk are presented.
...