• Corpus ID: 1631247

BioMM: Biologically-informed Multi-stage Machine learning for identification of epigenetic fingerprints

Junfang Chen, Emanuel Schwarz
arXiv: Quantitative Methods
The identification of reproducible biological patterns from high-dimensional data is a bottleneck for understanding the biology of complex illnesses such as schizophrenia. To address this, we developed a biologically informed, multi-stage machine learning (BioMM) framework. BioMM incorporates biological pathway information to stratify and aggregate high-dimensional biological data. We demonstrate the utility of this method using genome-wide DNA methylation data and show that it substantially… 
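The "stratify and aggregate" idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how pathway-level values are computed, so the per-pathway mean below is only a placeholder for whatever stage-one model the framework uses. The probe names, pathway names, and the function names `stratify_by_pathway` and `aggregate` are all hypothetical.

```python
from statistics import mean


def stratify_by_pathway(feature_names, pathway_map):
    """Group feature column indices by pathway.

    Pathway members that are absent from the data are skipped, and
    features that belong to no pathway are simply not used.
    """
    index = {name: i for i, name in enumerate(feature_names)}
    groups = {}
    for pathway, members in pathway_map.items():
        cols = [index[m] for m in members if m in index]
        if cols:
            groups[pathway] = cols
    return groups


def aggregate(samples, groups):
    """Collapse each sample to one value per pathway.

    ASSUMPTION: the mean is a stand-in aggregation, chosen only to keep
    the sketch short; it is not taken from the paper.
    """
    pathways = sorted(groups)
    table = [[mean(row[c] for c in groups[p]) for p in pathways]
             for row in samples]
    return pathways, table


# Toy methylation values for two samples and four hypothetical probes.
feature_names = ["cg01", "cg02", "cg03", "cg04"]
pathway_map = {
    "pathway_A": ["cg01", "cg02"],
    "pathway_B": ["cg03", "cg04", "cg99"],  # cg99 is not measured
}
samples = [
    [0.25, 0.75, 1.0, 0.5],
    [0.5, 1.0, 0.0, 0.25],
]

groups = stratify_by_pathway(feature_names, pathway_map)
pathways, pathway_features = aggregate(samples, groups)
```

The resulting `pathway_features` table has one column per pathway instead of one per probe, which is the kind of lower-dimensional input a later modelling stage could consume.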
1 Citation

Leveraging TCGA gene expression data to build predictive models for cancer drug response
Primary tumor gene expression is a good predictor of cancer drug response and investment in larger datasets containing both patient gene expression and drug response is needed to support future work of machine learning models.


References

DNA methylation age of human tissues and cell types
It is proposed that DNA methylation age measures the cumulative effect of an epigenetic maintenance system, and can be used to address a host of questions in developmental biology, cancer and aging research.
Diagnostic classification of schizophrenia by neural network analysis of blood-based gene expression signatures
An integrated genetic-epigenetic analysis of schizophrenia: evidence for co-localization of genetic associations and differential DNA methylation
This study represents the first systematic integrated analysis of genetic and epigenetic variation in schizophrenia, introducing a methodological approach that can be used to inform epigenome-wide association study analyses of other complex traits and diseases.
Biological Insights From 108 Schizophrenia-Associated Genetic Loci
Associations at DRD2 and several genes involved in glutamatergic neurotransmission highlight molecules of known and potential therapeutic relevance to schizophrenia, and are consistent with leading pathophysiological hypotheses.
Tobacco Smoking Leads to Extensive Genome-Wide Changes in DNA Methylation
The results of this study confirm the broad effect of tobacco smoking on the human organism, but also show that quitting tobacco smoking presumably allows regaining the DNA methylation state of never smokers.
Epigenetic mechanisms in schizophrenia.
Prenatal infection and schizophrenia: a review of epidemiologic and translational studies.
The promise of this work for facilitating the identification of susceptibility loci in genetic studies of schizophrenia is illustrated by examples of interaction between in utero exposure to infection and genetic variants.
Gene Expression Omnibus: NCBI gene expression and hybridization array data repository
The Gene Expression Omnibus (GEO) project was initiated in response to the growing demand for a public repository for high-throughput gene expression data. GEO provides a flexible and open design
The Dynamics of DNA Methylation in Schizophrenia and Related Psychiatric Disorders
Understanding of altered CpG methylation, hydroxymethylation, and active DNA demethylation provides a framework for the identification of new targets, which may be exploited for the pharmacological intervention of the psychosis associated with SZ and possibly BP+.
Schizophrenia and migration: a meta-analysis and review.
Findings of previous studies implicating migration as a risk factor for the development of schizophrenia and a quantitative index of the associated effect size are synthesized to suggest a role for psychosocial adversity in the etiology of schizophrenia.