Using a VOM model for reconstructing potential coding regions in EST sequences

  • A. Shmilovici, Irad Ben-Gal
  • Published 2007
  • Computer Science
  • Computational Statistics
  • This paper presents a method for annotating coding and noncoding DNA regions using variable order Markov (VOM) models. A main advantage of VOM models is that their order may vary across sequences, depending on the sequences’ statistics. As a result, VOM models are more flexible with respect to model parameterization and can be trained on relatively short sequences and on low-quality datasets, such as expressed sequence tags (ESTs). The paper presents a modified VOM model for…
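
To make the idea of a data-dependent order concrete, the following is a minimal sketch of a variable order Markov predictor over the DNA alphabet. It is not the modified VOM model of the paper (whose details are truncated above). It assumes a simple back-off rule: use the longest context that was observed at least `min_count` times in training, and fall back to shorter contexts otherwise. The function names, `max_order`, `min_count`, and the add-one smoothing are all illustrative assumptions.

```python
from collections import defaultdict
import math

ALPHABET = "ACGT"

def train_vom(sequences, max_order=3):
    """Count next-symbol occurrences for every context of length 0..max_order."""
    counts = defaultdict(lambda: defaultdict(int))
    for seq in sequences:
        for i, sym in enumerate(seq):
            for k in range(min(max_order, i) + 1):
                counts[seq[i - k:i]][sym] += 1  # k == 0 gives the empty context
    return counts

def pick_context(counts, history, max_order=3, min_count=5):
    """Return the longest suffix of `history` seen at least `min_count` times.

    Assumed back-off rule (hypothetical, for illustration only): sparse
    contexts are skipped, so the effective order adapts to the training data.
    """
    for k in range(min(max_order, len(history)), 0, -1):
        ctx = history[len(history) - k:]
        if ctx in counts and sum(counts[ctx].values()) >= min_count:
            return ctx
    return ""  # order-0 fallback

def prob(counts, history, sym, max_order=3, min_count=5):
    """P(sym | history) with add-one smoothing over the chosen context."""
    ctx = pick_context(counts, history, max_order, min_count)
    nxt = counts.get(ctx, {})
    total = sum(nxt.values())
    return (nxt.get(sym, 0) + 1) / (total + len(ALPHABET))

def log_likelihood(counts, seq, max_order=3, min_count=5):
    """Sum of log-probabilities of each symbol given its preceding symbols."""
    return sum(
        math.log(prob(counts, seq[:i], seq[i], max_order, min_count))
        for i in range(len(seq))
    )
```

Under these assumptions, a region could be labelled by training one such model on coding sequences and another on noncoding sequences, then comparing `log_likelihood` of the region under the two models. Because rare contexts are skipped, short or noisy training sets simply yield lower effective orders rather than unreliable high-order estimates.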
    17 Citations


    Gene-finding with the VOM model
    • 4
    Single Species Gene Finding
    Representing higher-order dependencies in networks
    • 86
    High-Order Entropy-Based Population Diversity Measures in the Traveling Salesman Problem
    • Y. Nagata
    • Computer Science, Medicine
    • Evolutionary Computation
    • 2020
    • 1
    A boosting method with asymmetric mislabeling probabilities which depend on covariates
    • K. Hayashi
    • Mathematics, Computer Science
    • Comput. Stat.
    • 2012
    • 5
    Representing Big Data as Networks: New Methods and Insights
    • J. Xu
    • Computer Science, Physics
    • ArXiv
    • 2017
    Latent Markovian Modelling and Clustering for Continuous Data Sequences


    ESTScan: A Program for Detecting, Evaluating, and Reconstructing Potential Coding Regions in EST Sequences
    • 1,042
    • Highly Influential
    A VOM based gene-finder that specializes in short genes
    • 1
    Interpolated markov chains for eukaryotic promoter recognition
    • 116
    ExonHunter: a comprehensive approach to gene finding
    • 56
    • Highly Influential
    DIANA-EST: a statistical analysis
    • 36
    • Highly Influential
    Variations on probabilistic suffix trees: statistical modeling and prediction of protein families
    • 186
    • Highly Influential
    Assessment of protein coding measures.
    • 408
    EasyGene – a prokaryotic gene finder that ranks ORFs by statistical significance
    • 175