Corpus ID: 16116745

The Nature of Biomolecular Sequences

Dimitris Anastassiou
Genomics is a highly cross-disciplinary field that creates paradigm shifts in such diverse areas as medicine and agriculture. It is believed that many significant scientific and technological endeavors in the 21st century will be related to the processing and interpretation of the vast information that is currently revealed from sequencing the genomes of many living organisms, including humans. Genomic information is digital in a very real sense; it is represented in the form of sequences of…

New computational and visual tools for biomolecular sequence analysis from the field of digital signal processing are introduced. In particular, we show that not only the magnitude, but also the…
Frequency-domain analysis of biomolecular sequences
An optimization procedure is provided that improves on traditional Fourier analysis in distinguishing coding from noncoding regions in DNA sequences. It is also demonstrated that color spectrograms can visually convey significant information about biomolecular sequences, facilitating understanding of their local nature, structure, and function.
10-11 bp periodicities in complete genomes reflect protein structure and DNA folding
It is shown that correlations within proteins mainly affect the oscillations at distances below 35 bp, while the long-range correlations up to 100 bp primarily reflect DNA folding. This suggests that a period of 11 bp in bacteria reflects negative supercoiling, whereas the significantly different period close to 10 bp in thermophilic archaea corresponds to positive supercoiling of their genomes.
Computational methods for the identification of genes in vertebrate genomic sequences.
  • J. Claverie
  • Biology, Medicine
  • Human molecular genetics
  • 1997
While performance is satisfactory for identifying the coding moiety of genes (internal coding exons), the determination of the full extent of the transcript (5' and 3' extremities of the gene) and the location of promoter regions are still unreliable.
Prediction of probable genes by Fourier analysis of genomic sequences
The aim is to use Fourier techniques to analyse this periodicity, and thereby to develop a tool to recognize coding regions in genomic DNA. The authors find that the relative height of the peak at f = 1/3 in the Fourier spectrum is a good discriminator of coding potential.
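As a rough illustration of the idea in this summary, the sketch below measures how strongly the spectrum of a DNA string peaks at f = 1/3. It is not the paper's implementation: the function names, the use of 0/1 indicator sequences for the four bases, and the choice to normalize by the mean power over all nonzero frequencies are assumptions made for this example.

```python
import cmath

BASES = "ACGT"

def spectral_power(seq, k):
    """Sum over the four bases of |DFT of the base's 0/1 indicator sequence|^2 at index k."""
    n = len(seq)
    total = 0.0
    for base in BASES:
        coeff = sum(
            cmath.exp(-2j * cmath.pi * k * i / n)
            for i, symbol in enumerate(seq)
            if symbol == base
        )
        total += abs(coeff) ** 2
    return total

def period3_peak_ratio(seq):
    """Power at f = 1/3 divided by the mean power over nonzero frequencies (assumed normalization)."""
    seq = seq.upper()
    # Trim so the length is a multiple of 3 and f = 1/3 falls exactly on DFT index n/3.
    seq = seq[: len(seq) - len(seq) % 3]
    n = len(seq)
    if n < 3:
        raise ValueError("need at least 3 bases")
    peak = spectral_power(seq, n // 3)
    background = sum(spectral_power(seq, k) for k in range(1, n)) / (n - 1)
    # Guard against sequences with essentially no power away from k = 0 (e.g. a homopolymer).
    return peak / background if background > 1e-9 else 0.0
```

A toy sequence with a strict three-base repeat, such as `"ATG" * 20`, should give a much larger ratio than one with a four-base repeat, such as `"ACGT" * 15`. Real sequences would normally be scanned with a sliding window, and any decision threshold would need to be calibrated on known coding and noncoding regions.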
Understanding the human genome
Scientists suspecting a genetic correlation with disease can now seek out starting points in the genes of humans and other creatures, compressing what would have been a decade or more of research into a day or two of database queries.
3-, 10.5-, 200- and 400-base periodicities in genome sequences
The above periodicities are the main hidden oscillating patterns detected so far in the genomic sequences, characteristic for the protein-coding sequences only.
Assessment of protein coding measures.
This paper reviews and synthesizes the underlying coding measures from published algorithms and concludes that a very simple and obvious measure, counting oligomers, is more effective than any of the more sophisticated measures.
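For readers unfamiliar with the term, counting oligomers means tallying every length-k substring of a sequence. The sketch below shows only that counting step. The function name and the default k = 6 are assumptions for illustration, and how the counts are turned into a coding score is not specified in the summary above.

```python
from collections import Counter

def count_oligomers(seq, k=6):
    """Count all overlapping length-k substrings (k = 6 is an assumed default)."""
    seq = seq.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
```

For example, `count_oligomers("ATGATG", 3)` tallies the four overlapping trimers ATG, TGA, GAT, and ATG.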
Short-range order in two eukaryotic genomes: relation to chromosome structure.
  • J. Widom
  • Biology, Medicine
  • Journal of molecular biology
  • 1996
The results suggest that the requirements of chromosome structure place significant constraints on eukaryotic genome organization; they reveal additional signals that may be related to nucleosome positioning; and they reveal a wealth of additional new non-random aspects of genome sequence organization.
Method to determine the reading frame of a protein from the purine/pyrimidine genome sequence and its possible evolutionary justification.
  • J. C. Shepherd
  • Biology, Medicine
  • Proceedings of the National Academy of Sciences of the United States of America
  • 1981
The periodic variations obtained by correlating the relative positions of purines and pyrimidines in a wide variety of genomes suggest that there may be enough of an earlier comma-free coding system still present to permit determination of the reading frame and approximate extent of the present protein coding stretches.