Computational genomics

  • Eugene V. Koonin
  • Published 6 March 2001
  • Biology
  • Current Biology

Unravelling the ORFan Puzzle

It is demonstrated that ORFans are an untapped source of research, requiring further computational and experimental studies, and some of the studies aimed at understanding ORFans, their functions and their origins are reviewed.

Better prediction of sub‐cellular localization by combining evolutionary and structural information

This work explored the evolutionary information contained in multiple alignments, together with aspects of protein structure, to predict localization in the absence of homology and targeting motifs. It developed two separate systems that performed best for extra‐cellular and nuclear proteins and were significantly less accurate than TargetP for mitochondrial proteins.

Multiscale DNA partitioning: statistical evidence for segments

This work focuses on partitioning with respect to GC content and proposes a new approach that provides statistical error control. The approach is based on a statistical multiscale criterion, making it a segmentation method that searches for segments of any length (on all scales) simultaneously.

Comparative and Evolutionary Genomics of Pseudomonas syringae

To overcome the computational challenges of large-scale comparative genome analysis, a novel comparative genomic pipeline named DeNoGAP is designed, which provides a robust computational pipeline for performing various comparative genomics tasks, such as gene prediction, ortholog prediction, functional annotation, and so on.

Automatic prediction of protein function

Computational biologists have begun to develop ab initio methods that predict aspects of function, including subcellular localization, post-translational modifications, functional type and protein-protein interactions, where the most accurate approaches rely on identifying short signalling motifs, while the most general methods utilise tools of artificial intelligence.

Quantitative assessment of relationship between sequence similarity and function similarity

This study provides a benchmark for estimating confidence in function assignments based purely on sequence similarity. It quantifies the correlation between functional similarity and sequence similarity, measured by sequence identity or by the statistical significance of the alignment, and compares that correlation against randomly chosen protein pairs.

Protein classification using probabilistic chain graphs and the Gene Ontology structure

Results indicate that direct utilization of the Gene Ontology improves predictive ability, outperforming traditional models that do not take advantage of dependencies among functional terms.

Mass Spectrometry‐based Methods of Proteome Analysis

Owing to continuous and rapid improvements in instrument sensitivity, throughput capacity, software versatility, and techniques of statistical validation, MS-based approaches have during recent years become mainstream methods for proteome analysis.

Evaluation of deep learning techniques for identification of sarcoma-causing carcinogenic mutations

The study developed a framework for the early detection of human sarcoma using Recurrent Neural Network (RNN) deep learning algorithms.



Combining diverse evidence for gene recognition in completely sequenced bacterial genomes

A new program, ORPHEUS, is presented that identifies candidate genes and accurately predicts gene starts. The program correctly identified 93.3% of experimentally annotated genes longer than 100 codons described in the PIR-International database, and 92.9% of predicted starts coincided with the feature table description.

Computational molecular biology - an algorithmic approach

In one of the first major texts in the emerging field of computational molecular biology, Pavel Pevzner covers a broad range of algorithmic and combinatorial topics and shows how they are connected

A genomic perspective on protein families.

Comparison of proteins encoded in seven complete genomes from five major phylogenetic lineages and elucidation of consistent patterns of sequence similarities allowed the delineation of 720 clusters of orthologous groups (COGs), which comprise a framework for functional and evolutionary genome analysis.

Bioinformatics - a practical guide to the analysis of genes and proteins

This work focuses on novel approaches to biological analysis, including the use of Perl to facilitate biological analysis and applications in proteomics and protein identification.

Pattern of selective constraint in C. elegans and C. briggsae genomes.

Similarity between related genomes may carry information on selective constraint in each of them. We analysed patterns of similarity between several homologous regions of Caenorhabditis elegans and

Comparison of the complete protein sets of worm and yeast: orthology and divergence.

Comparative analysis of predicted protein sequences encoded by the genomes of Caenorhabditis elegans and Saccharomyces cerevisiae suggests that most of the core biological functions are carried out

Initial sequencing and analysis of the human genome

The results of an international collaboration to produce and make freely available a draft sequence of the human genome are reported and an initial analysis is presented, describing some of the insights that can be gleaned from the sequence.

Predicting functions from protein sequences—where are the bottlenecks?

The exponential growth of sequence data does not necessarily lead to an increase in knowledge about the functions of genes and their products, so the identification, verification and annotation of functional features need to be drastically improved.

Sequence the Human Genome

Ouellette BFF: Bioinformatics: A Practical Guide to the Analysis of Genes and Proteins

  • 2001