Efficient inference for sparse latent variable models of transcriptional regulation

Zhenwen Dai, Mudassar Iqbal, Neil D. Lawrence, Magnus Rattray
Pages 3776-3783
Abstract

Motivation: Regulation of gene expression in prokaryotes involves complex co-regulatory mechanisms involving large numbers of transcriptional regulatory proteins and their target genes. Uncovering these genome-scale interactions constitutes a major bottleneck in systems biology. Sparse latent factor models, assuming activity of transcription factors (TFs) as unobserved, provide a biologically interpretable modelling framework, integrating gene expression and genome-wide binding data…

Multi-study inference of regulatory networks for more accurate models of gene regulation

This study explores previous integration strategies, such as batch correction and model ensembles, introduces a new multitask learning approach for joint network inference across several datasets, and demonstrates robustness to both false positives in the prior information and heterogeneity among datasets.

Probing 3’UTRs as modular regulators of gene expression

The results show that the effect of a motif on RNA abundance depends on both its host terminator and the associated promoter sequence, which emphasises the need for improved motif inference algorithms that include both local and global context effects.

Limitations of composability of cis-regulatory elements in messenger RNA

The results show that the effect of a motif on RNA abundance depends on both its host terminator and the associated promoter sequence. This emphasises the need for improved motif inference that includes both local and global context effects, which could aid in the accurate use of CREs for the engineering of synthetic genetic constructs.



Probabilistic inference of transcription factor concentrations and gene-specific regulatory activities

A probabilistic state space model is developed that allows genome-wide inference of both transcription factor protein concentrations and their effect on the transcription rates of each target gene from microarray data. Predictions from the model are consistent with the underlying biology and offer novel quantitative insights into the regulatory structure of the yeast cell.

A prior-based integrative framework for functional transcriptional regulatory network inference

A regulatory network inference algorithm is developed, based on probabilistic graphical models, to integrate expression with auxiliary datasets supporting a regulatory edge, and natural genetic variation is suggested as the most informative perturbation for network inference.

Large-scale learning of combinatorial transcriptional dynamics from gene expression

A novel method to infer combinatorial regulation of gene expression by multiple transcription factors in large-scale transcriptional regulatory networks is presented, implementing a factorial hidden Markov model with a non-linear likelihood to represent the interactions between the hidden transcription factors.

Factor analysis for gene regulatory networks and transcription factor activity profiles

This paper explores the performance of five factor analysis algorithms, Bayesian as well as classical, on problems with biological context using both simulated and real data, and demonstrates that, if the underlying network is sparse, it is still possible to reconstruct hidden activity profiles of TFs to some degree without prior connectivity information.

Bayesian sparse hidden components analysis for transcription regulation networks

This work applies a Bayesian hidden component model for the expression array data to identify which of the potential binding sites are actually used by the regulatory proteins in the studied cell conditions, the strength of their control, and their activation profile in a series of experiments.

Wisdom of crowds for robust gene network inference

A comprehensive blind assessment of over 30 network inference methods on Escherichia coli, Staphylococcus aureus, Saccharomyces cerevisiae and in silico microarray data defines the performance, data requirements and inherent biases of different inference approaches, and provides guidelines for algorithm application and development.

High-Dimensional Sparse Factor Modeling: Applications in Gene Expression Genomics

These case studies aim to investigate and characterize heterogeneity of structure related to specific oncogenic pathways, as well as links between aggregate patterns in gene expression profiles and clinical biomarkers.

Scalable latent-factor models applied to single-cell RNA-seq data separate biological drivers from confounding effects

A computationally efficient model is described that uses prior pathway annotation to guide inference of the biological drivers underpinning the heterogeneity in single-cell RNA-sequencing and can robustly decompose scRNA-seq datasets into interpretable components and facilitate the identification of novel sub-populations.

Revealing strengths and weaknesses of methods for gene network inference

The results of this community-wide experiment show that reliable network inference from gene expression data remains an unsolved problem, and they indicate potential ways to improve network reconstruction.

An experimentally supported model of the Bacillus subtilis global transcriptional regulatory network

A new combination of network component analysis and model selection is used to simultaneously estimate transcription factor activities and learn a substantially expanded transcriptional regulatory network for this bacterium, significantly increasing the understanding of various cell processes, such as spore formation.