Recent Advances on the Machine Learning Methods in Identifying DNA Replication Origins in Eukaryotic Genomics

Fu-Ying Dao, Hao Lv, Fang Wang, and Hui Ding
Frontiers in Genetics
The initiation site of DNA replication is called the origin of replication (ORI). It is regulated by a set of regulatory proteins and plays an important role in the basic biochemical processes of cell growth and division in all living organisms. The study of ORIs is therefore essential for understanding the cell-division cycle and gene expression regulation, so that researchers can develop new strategies against genetic diseases by using knowledge of DNA replication. Thus, the accurate… 

A computational platform to identify origins of replication sites in eukaryotes
The first integrated predictor, named iORI-Euk, was built to identify ORIs in multiple eukaryotes and multiple cell types; the best results were obtained by using a support vector machine in a 5-fold cross-validation test and an independent dataset test.
Yeast autonomously replicating sequence (ARS): Identification, function, and modification
The identification methods for ARSs are summarized, especially the bioinformatics prediction methods of the past few years, and ARS modification combined with high‐throughput sequencing is elaborated, shedding further light on the roles of ARSs and providing deep insights towards the optimization of ARSs.
Reconfiguring Okazaki fragment start sites on a genome by using a data-driven approach
Novel DNA sequences are generated with improved binding of T7 primase and improved RNA primer synthesis, as validated experimentally on the basis of the principles learned about DNA-primase binding.
Reconfiguring primase DNA-recognition sequences by using a data-driven approach
The binding to DNA of T7 primase, as a model system for specific DNA-protein interactions, is described, which triggers the formation of RNA primers that serve as Okazaki fragment start sites during DNA replication.
A Brief Survey for MicroRNA Precursor Identification Using Machine Learning Methods
The review summarizes the current advances in pre-miRNA recognition based on computational methods, including the construction of benchmark datasets, feature extraction methods, prediction algorithms, and the results of the models.
Multiple plasmid origin-of-transfer substrates enable the spread of natural antimicrobial resistance to human pathogens
This work considers that the plasmid-borne origin-of-transfer substrates encode specific DNA structural properties that can facilitate finding these regions in large datasets, and develops a DNA structure-based alignment procedure for typing the transfer substrates that outperforms mere sequence-based approaches.
Multiple plasmid origin‐of‐transfer regions might aid the spread of antimicrobial resistance to human pathogens
A hypothetical network is found to facilitate the transfer of antimicrobial resistance from environmental genetic reservoirs to human pathogens, which might be an important driver of the observed rapid resistance development in humans and thus an important point of focus for future prevention measures.
Machine Learning (ML)‐Assisted Design and Fabrication for Solar Cells
A statistical analysis of the literature shows that artificial neural networks and genetic algorithms are the two most applied ML techniques, and that the optimization of device structures and the optimization of fabrication processes are the more popular topics.


Recent advances in the genome-wide study of DNA replication origins in yeast
The sequence characteristics and chromosome structures of ORIs in the four yeast species, which can be utilized to improve the prediction of yeast replication origins, are discussed.
The relationship between DNA replication and human genome organization.
The first large-scale data set of experimentally determined origins of replication in human is analyzed and it is concluded that the impact of DNA replication on human genome organization is considerably weaker than previously proposed.
A genomic view of eukaryotic DNA replication
Multiple microarray-based approaches that have been used to study DNA replication in both S. cerevisiae and higher eukaryotes are described and a powerful new approach to define the mechanisms that regulate replication origin function is proposed.
DeOri: a database of eukaryotic DNA replication origins
A Database of Eukaryotic ORIs (DeOri) is constructed, which contains all the eukaryotic ORIs identified by genome-wide analyses currently available and helps to reveal the mechanism of the regulation of DNA replication.
iOri-Human: identify human origin of replication by incorporating dinucleotide physicochemical properties into pseudo nucleotide composition
A predictor called “iOri-Human” is presented, in which 96 physicochemical properties for the 16 possible constituent dinucleotides are incorporated to reflect the global sequence patterns in DNA as well as its local sequence patterns.
Transcription Initiation Activity Sets Replication Origin Efficiency in Mammalian Cells
This study reveals that 85% of the replication initiation sites in mouse embryonic stem (ES) cells are associated with transcriptional units, suggesting a co-evolution of the regulatory regions driving replication and transcription in the mouse genome.