DeepChrome: deep-learning for predicting gene expression from histone modifications

Ritambhara Singh, Jack Lanchantin, Gabriel Robins, Yanjun Qi
Volume 32, Issue 17
MOTIVATION: Histone modifications are among the most important factors that control gene regulation. Computational methods that predict gene expression from histone modification signals are highly desirable for understanding their combinatorial effects in gene regulation. This knowledge can help in developing 'epigenetic drugs' for diseases like cancer. Previous studies for quantifying the relationship between histone modifications and gene expression levels either failed to capture…

DeepDiff: DEEP‐learning for predicting DIFFerential gene expression from histone modifications

A novel attention‐based deep learning architecture, DeepDiff, is developed that provides a unified and end‐to‐end solution to model and to interpret how dependencies among histone modifications control the differential patterns of gene regulation.

Accurate and highly interpretable prediction of gene expression from histone modifications

ShallowChrome, a novel computational pipeline that models transcriptional regulation via histone modifications (HMs) in an accurate and interpretable way, is proposed; it achieves state-of-the-art results on the binary classification of gene transcriptional states over 56 cell types from the REMC database, largely outperforming recent deep learning approaches.

SimpleChrome: Encoding of Combinatorial Effects for Predicting Gene Expression

SimpleChrome, a deep learning model, is presented; it learns latent histone modification representations of genes, which help explain the combinatorial effects of cross-gene interactions and direct gene regulation on target gene expression.

Learning the histone codes of gene regulation with large genomic windows and three-dimensional chromatin interactions using transformer

The study shows the great power of attention-based deep learning as a versatile modeling approach for the complex epigenetic landscape of gene regulation and highlights its potential as an effective toolkit that facilitates scientific discoveries in computational epigenetics.

Learning the histone codes with large genomic windows and three-dimensional chromatin interactions using transformer

A transformer-based, three-dimensional chromatin conformation-aware deep learning architecture that achieves the state-of-the-art performance in the quantitative deciphering of the histone codes in gene regulation, highlighting the great advantage of attention-based deep modeling of complex interactions in epigenomes.

DeepChrome 2.0: Investigating and Improving Architectures, Visualizations, & Experiments

Results from cross-cell prediction experiments suggest that the relationship between histone modification signals and gene expression is independent of cell type; a PyTorch re-implementation of DeepChrome is also released.

Gene Expression Prediction using Stacked Temporal Convolutional Network

This study transforms histone modification data into a one-dimensional space and predicts gene expression using Temporal Convolutional Networks; experimental results show that the approach outperforms the state-of-the-art method in AUC, accuracy, precision, recall, F-score, and specificity.

Machine learning for deciphering cell heterogeneity and gene regulation

An overview is provided of state-of-the-art computational methods and their underlying statistical concepts, which range from matrix factorization and regularized linear regression to deep learning.



Histone modification levels are predictive for gene expression

It is found that histone modification levels and gene expression are very well correlated, and it is shown that only a small number of histone modifications are necessary to accurately predict gene expression.

Modeling gene expression using chromatin features in various cellular contexts

This study builds a novel quantitative model and finds that expression status and expression levels can be predicted by different groups of chromatin features, both with high accuracy, and that expression levels measured by CAGE are better predicted than by RNA-PET or RNA-Seq.

Deep learning of the tissue-regulated splicing code

The deep architecture surpasses the performance of the previous Bayesian method for predicting AS patterns and demonstrates that deep architectures can be beneficial, even with a moderately sparse dataset.

Predicting gene expression in T cell differentiation from histone modifications and transcription factor binding affinities by linear mixture models

The method improves gene expression prediction relative to a single linear model, as often used by previous approaches, and recovers the known role of the modifications H3K4me3 and H3K27me3 in activating cell-specific genes, as well as that of some transcription factors related to CD4+ T cell differentiation.

Predicting the sequence specificities of DNA- and RNA-binding proteins by deep learning

This work shows that sequence specificities can be ascertained from experimental data with 'deep learning' techniques, which offer a scalable, flexible and unified computational approach for pattern discovery.

Combinatorial Roles of DNA Methylation and Histone Modifications on Gene Expression

This work derived 83 rules and identified some key PTMs that have considerable effects, such as H2BK5ac, H3K79me123, H4K91ac, and H3K4me3, that can explain the low expression of genes in CD4+ T cells.

A statistical framework for modeling gene expression using chromatin features and application to modENCODE datasets

A statistical framework is developed to study the relationship between chromatin features and gene expression that can be used to predict gene expression of protein coding genes, as well as microRNAs, including modENCODE worm datasets.

Defining the chromatin signature of inducible genes in T cells

The results suggest that the majority of inducible genes are primed for activation by having an active chromatin signature and promoter Pol II with or without ongoing elongation.

Gene Expression Differences Among Primates Are Associated With Changes in a Histone Epigenetic Modification

A modest, yet important role is suggested for epigenetic changes in gene expression differences between primates, based on an epigenetic histone modification, H3K4me3, which is thought to promote transcription.

The correlation between histone modifications and gene expression.

Combinations of histone marks are indicative of transcriptional states, and these loci coincide with genes that are actively transcribed in neural precursor cells.