Multi-omic data integration enables discovery of hidden biological regularities

Ali Ebrahim, Elizabeth Brunk, Justin Tan, Edward J. O'Brien, Donghyuk Kim, Richard Szubin, Joshua A. Lerman, Anna Lechner, Anand V. Sastry, Aarash Bordbar, Adam M. Feist, Bernhard O. Palsson
Nature Communications

Rapid growth in size and complexity of biological data sets has led to the ‘Big Data to Knowledge’ challenge. We develop advanced data integration methods for multi-level analysis of genomic, transcriptomic, ribosomal profiling, proteomic and fluxomic data. First, we show that pairwise integration of primary omics data reveals regularities that tie cellular processes together in Escherichia coli: the number of protein molecules made per mRNA transcript and the number of ribosomes required per…

A study on multi-omic oscillations in Escherichia coli metabolic networks

The integration of multi-omic data reveals that E. coli multi-omic metabolic networks contain position-dependent, recurring patterns, which could provide clues to long-range correlations in the bacterial genome.

Isoform-Level Interpretation of High-Throughput Proteomics Data Enabled by Deep Integration with RNA-seq.

This work exploits transcript-level expression from RNA-seq to set prior likelihoods and enable protein isoform abundances to be directly estimated from LC-MS/MS, an approach derived from the principle that most genes appear to be expressed as a single dominant isoform in a given cell type or tissue.

Synthesizing Systems Biology Knowledge from Omics Using Genome‐Scale Models

The authors find that, concurrent with advancements in omic technologies, genome‐scale modeling methods are also expanding to enable better interpretation of omic data, and that continued synthesis of valuable knowledge through the integration of omic data with GEMs is expected.

Genome-Scale Metabolic Modeling Enables In-Depth Understanding of Big Data

The available Big Data useful for metabolic modeling and the available GEM reconstruction tools that integrate Big Data are analyzed to provide a perspective on emerging areas, such as annotation, data management, and machine learning, in which GEMs will play a key role in the further utilization of Big Data.

Recon3D enables a three-dimensional view of gene variation in human metabolism

Recon3D, a computational resource that includes three-dimensional metabolite and protein structure data and enables integrated analyses of metabolic functions in humans, is presented and used to functionally characterize mutations associated with disease and to identify metabolic response signatures caused by exposure to certain drugs.

CoMetGeNe: mining conserved neighborhood patterns in metabolic and genomic contexts

CoMetGeNe is an exploratory tool at both the genomic and the metabolic levels, leading to insights into the conservation of functionally related clusters of neighboring enzyme-coding genes.

Optimization of Multi-Omic Genome-Scale Models: Methodologies, Hands-on Tutorial, and Perspectives.

This review covers the principal methods used for constraint-based modelling in systems biology and explores how the integration of multi-omic data can improve the phenotypic predictions of genome-scale metabolic models.

A Detailed Catalogue of Multi-Omics Methodologies for Identification of Putative Biomarkers and Causal Molecular Networks in Translational Cancer Research

This review summarizes and categorizes the most current computational methodologies and tools for integration of distinct molecular layers in the context of translational cancer research and personalized therapy, and shows that the supervised and unsupervised analyses performed yield meaningful and novel findings.



Systems biology of the structural proteome

GEM-PRO offers insight into the physical embodiment of an organism’s genotype, and its use in this comparative framework enables exploration of adaptive strategies for these organisms, opening the door to many new lines of research.

Analysis of omics data with genome-scale models of metabolism.

Biochemically, genetically, and genomically consistent knowledge bases are increasingly being used to extract deeper biological knowledge and understanding from these data sets than possible by inferential methods, largely due to knowledge bases providing a validated biological context for interpreting the data.

Global analysis of protein expression in yeast

A Saccharomyces cerevisiae fusion library is created where each open reading frame is tagged with a high-affinity epitope and expressed from its natural chromosomal location, and it is found that about 80% of the proteome is expressed during normal growth conditions.

The model organism as a system: integrating 'omics' data sets

Researchers are rising to the challenge by using omics data integration to address fundamental biological questions that would increase the understanding of systems as a whole.

Genome-scale models of metabolism and gene expression extend and refine growth phenotype prediction

An ME‐Model for Escherichia coli is constructed—a genome‐scale model that seamlessly integrates metabolic and gene product expression pathways and formalizes the principle of growth optimization to enable the accurate prediction of multi‐scale phenotypes.

An integrative, multi-scale, genome-wide model reveals the phenotypic landscape of Escherichia coli

This work presents an integrative modeling methodology that unifies under a common framework the various biological processes and their interactions across multiple layers and paves the way toward integrative techniques that extract knowledge from a variety of biological data to achieve more than the sum of their parts in the context of prediction, analysis, and redesign of biological systems.

A streamlined ribosome profiling protocol for the characterization of microorganisms.

This streamlined workflow enables greater throughput, cuts the time from harvest to the final library in half (down to 3-4 days), and generates a high fraction of informative reads, all while retaining the high quality standards of the existing protocol.

Global quantification of mammalian gene expression control

Using a quantitative model, the first genome-scale prediction of synthesis rates of mRNAs and proteins is obtained and it is found that the cellular abundance of proteins is predominantly controlled at the level of translation.

Comprehensive mass-spectrometry-based proteome quantification of haploid versus diploid yeast

Comparison of protein levels of essentially all endogenous proteins in haploid yeast cells to their diploid counterparts spans more than four orders of magnitude in protein abundance with no discrimination against membrane or low level regulatory proteins.

The quantitative and condition-dependent Escherichia coli proteome

This work uses efficient protein extraction and sample fractionation, as well as state-of-the-art quantitative mass spectrometry techniques, to generate a comprehensive, condition-dependent protein-abundance map for Escherichia coli, uncovering system-wide proteome allocation, expression regulation and post-translational adaptations.