Prediction of methylated CpGs in DNA sequences using a support vector machine

@article{Bhasin2005PredictionOM,
  title={Prediction of methylated CpGs in DNA sequences using a support vector machine},
  author={Manoj K. Bhasin and Hong Zhang and Ellis L. Reinherz and Pedro A. Reche},
  journal={FEBS Letters},
  year={2005},
  volume={579}
}
DNA methylation plays a key role in the regulation of gene expression. [...] Initially, an SVM module was developed from human data for the prediction of human-specific methylation sites. This module achieved an MCC and AUC of 0.501 and 0.814, respectively, when evaluated using 5-fold cross-validation. The performance of this SVM-based module was better than that of classifiers built using alternative machine learning and statistical algorithms, including artificial neural networks, Bayesian statistics…
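For readers unfamiliar with the two metrics quoted in the abstract, the following is a minimal sketch of how the Matthews correlation coefficient (MCC) and the area under the ROC curve (AUC) are commonly computed. The function names `mcc` and `auc` are hypothetical and chosen for illustration; this is not the authors' evaluation code, and the abstract does not say how their figures were calculated.

```python
import math


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts.

    Returns 0.0 when any marginal total is zero (a common convention,
    assumed here), since the coefficient is otherwise undefined.
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


def auc(scores, labels):
    """AUC as the probability that a random positive outscores a random negative.

    Ties count as half a win. Labels are assumed to be 1 (methylated)
    or 0 (unmethylated).
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

MCC ranges from -1 to 1 and AUC from 0 to 1, with 0 and 0.5 respectively corresponding to chance-level prediction.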
Prediction of methylation CpGs and their methylation degrees in human DNA sequences
TLDR
The trinucleotide composition, a 64-dimensional feature vector of the occurrence frequencies of the 64 trinucleotides in a DNA sequence, was used to build an SVM model for predicting CpG methylation degrees in humans; the good results indicated that the proposed method is a useful tool for DNA methylation prediction research.
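The 64-dimensional trinucleotide composition described above can be sketched as follows. This is an illustrative reading of the summary, not the cited paper's code: the name `trinucleotide_composition`, the use of overlapping windows, and the skipping of windows that contain ambiguous bases such as N are all assumptions.

```python
from itertools import product

# All 64 trinucleotides in a fixed (lexicographic) order: AAA, AAC, ..., TTT.
TRINUCLEOTIDES = ["".join(p) for p in product("ACGT", repeat=3)]


def trinucleotide_composition(seq):
    """Return the 64 trinucleotide frequencies of a DNA sequence.

    Overlapping windows of length 3 are counted; windows containing
    characters other than A, C, G, T are skipped (an assumption).
    """
    seq = seq.upper()
    counts = dict.fromkeys(TRINUCLEOTIDES, 0)
    total = 0
    for i in range(len(seq) - 2):
        kmer = seq[i:i + 3]
        if kmer in counts:
            counts[kmer] += 1
            total += 1
    if total == 0:
        return [0.0] * len(TRINUCLEOTIDES)
    return [counts[k] / total for k in TRINUCLEOTIDES]
```

A vector produced this way could then be passed to any SVM implementation as the feature representation of the sequence window around a CpG site.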
Predicting methylation status of human DNA sequences by pseudo-trinucleotide composition.
TLDR
Good prediction results reveal that the pseudo-trinucleotide composition is an effective representation method for DNA sequence and plays a very important role in the prediction of DNA function.
The prediction of methylation states in human DNA sequences based on hexanucleotide composition and feature selection
TLDR
In this study, the hexanucleotide composition is utilized to characterize the DNA sequences and an improved genetic algorithm is employed to obtain the optimal feature subset from the preselected feature subset and the parameters of the support vector machine.
A Predictive Model for Genomic Methylation Targets in Humans
TLDR
Post-classification analysis of the correctly and mistakenly classified methylation targets revealed that the classification accuracy varies depending on the co-localization of the CpG sites with genomic features involved in gene expression, such as CpG islands and exonic/intronic sequences, which reinforces the importance of incorporating site-specific features in the predictive model and the development of site-specific classifiers.
Predicting methylation status of CpG islands in the human brain
TLDR
A classifier called MethCGI is developed for predicting the methylation status of CpG islands using a support vector machine (SVM) and achieves specificity and sensitivity on the brain data, and can also correctly predict about two-thirds of the data from other tissues reported in the MethDB database.
Predicting DNA Methylation State of CpG Dinucleotide Using Genome Topological Features and Deep Networks
TLDR
Deep learning based (stacked denoising autoencoders, or SdAs) software named “DeepMethyl” is developed to predict the methylation state of DNA CpG dinucleotides using features inferred from three-dimensional genome topology (based on Hi-C) and DNA sequence patterns.
CpGIMethPred: computational model for predicting methylation status of CpG islands in human genome
TLDR
CpGIMethPred, a set of support vector machine-based models for predicting the methylation status of CpG islands in the human genome under normal conditions, is developed and can achieve higher specificity and accuracy than the existing models while maintaining comparable sensitivity.
Genome-Wide Prediction of DNA Methylation Using DNA Composition and Sequence Complexity in Human
TLDR
A novel algorithm is proposed that accurately extracted sequence complexity features (seven features) and developed a support-vector-machine-based prediction model with integration of the reported DNA composition features (trinucleotide frequency and GC content, 65 features) by utilizing the methylation profiles of embryonic stem cells in human.
A Human DNA Methylation Site Predictor Based on SVM
  • Yi-Ming Sun, W. Liao, +4 authors Li-Ching Wu
  • Biology, Computer Science
    2009 Ninth IEEE International Conference on Bioinformatics and BioEngineering
  • 2009
TLDR
It is proposed that the differential features or methylations vary between the different regions because the features common to each DNA region made up only 50% of the top 70 features.
Enhancement on the predictive power of the prediction model for human genomic DNA methylation
DNA methylation is an important type of epigenetic modification that plays an instrumental role in organogenesis, cellular differentiation, suppression of deleterious elements, and carcinogenesis. In

References

Tumour class prediction and discovery by microarray-based DNA methylation analysis.
TLDR
This work has developed the first microarray-based technique which allows genome-wide assessment of selected CpG dinucleotides as well as quantification of methylation at each site, demonstrating that genome-wide analysis of methylation patterns combined with supervised and unsupervised machine learning techniques constitutes a powerful novel tool to classify human cancers.
An improved version of the DNA methylation database (MethDB)
TLDR
The DNA Methylation database (MethDB) is currently the only public database for DNA methylation and currently contains methylation patterns, profiles and total methylation content data for 46 species, 160 tissues and 72 phenotypes coming from a total of 6667 experiments.
Aberrant CpG-island methylation has non-random and tumour-type–specific patterns
TLDR
This report presents a global analysis of the methylation status of 1,184 unselected CpG islands in each of 98 primary human tumours using restriction landmark genomic scanning (RLGS), and estimates that an average of 600 CpG islands were aberrantly methylated in the tumours, including early stage tumours.
DNA methylation and chromatin structure: The puzzling CpG islands
TLDR
The aim is to gather what is known about the mechanism(s) whereby CpG islands, which remain protected from methylation in normal cells, are susceptible to methylation in tumor cells.
Predicting aberrant CpG island methylation
TLDR
The data indicate that CpG islands differ in their intrinsic susceptibility to de novo methylation, and suggest that the propensity for a CpG island to become aberrantly methylated can be predicted based on its sequence context.
Age-dependent DNA methylation changes in the ITGAL (CD11a) promoter
TLDR
The results indicate that hypomethylation of regions flanking the ITGAL promoter may increase CD11a expression, and suggest that age-dependent hypomethylation of promoters lacking CpG islands, perhaps due to decreased DNA methyltransferase expression, may be one mechanism contributing to increased T cell gene expression with aging.
Non-CpG methylation is prevalent in embryonic stem cells and may be mediated by DNA methyltransferase 3a.
TLDR
Analysis of genomic methylation in transgenic Drosophila expressing Dnmt3a reveals that Dnmt3a is predominantly a CpG methylase but is also able to induce methylation at CpA and at CpT.
Comprehensive analysis of CpG islands in human chromosomes 21 and 22
  • D. Takai, Peter A. Jones
  • Biology, Medicine
    Proceedings of the National Academy of Sciences of the United States of America
  • 2002
TLDR
The complete genomic sequences of human chromosomes 21 and 22 are used to examine the properties of CpG islands in different sequence classes by using a search algorithm that is compatible with the recent detection of 5-methylcytosine in Drosophila, and might suggest that S. cerevisiae has, or once had, CpG methylation.
Aging, methylation and cancer.
TLDR
The concept that age-related methylation is a predisposing factor for neoplasia implies that it may serve as a diagnostic risk marker in cancer, and as a novel target for chemoprevention in humans.
Knowledge-based analysis of microarray gene expression data by using support vector machines.
  • M. P. Brown, W. Grundy, +5 authors D. Haussler
  • Computer Science, Medicine
    Proceedings of the National Academy of Sciences of the United States of America
  • 2000
TLDR
A method of functionally classifying genes by using gene expression data from DNA microarray hybridization experiments, based on the theory of support vector machines (SVMs), to predict functional roles for uncharacterized yeast ORFs based on their expression data is introduced.