On the origin of coding sequences from random open reading frames

@article{Hglund2005OnTO,
  title={On the origin of coding sequences from random open reading frames},
  author={Mattias H{\"o}glund and Torbj{\"o}rn S{\"a}ll and Dan R{\"o}hme},
  journal={Journal of Molecular Evolution},
  year={2005},
  volume={30},
  pages={104-108}
}
Summary: The size distribution of 411 randomly selected mammalian exons was investigated. This distribution was found to be unimodal with a frequency maximum of 120 bp. Detailed analysis of the distribution demonstrated that larger exons (>150 bp) have a high goodness of fit to the size distribution of open reading frames (ORFs) in a random sequence, i.e., (61/64)^t, in which t is the number of triplets. Based on this observation, the general character of the total exon size distribution suggested…
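
The random-ORF model quoted in the summary can be checked numerically. The following is a minimal sketch, not the authors' method: it assumes the standard genetic code (3 stop codons out of 64) and uniform, independent bases, and the function names are hypothetical. It compares the closed form (61/64)^t with a simple Monte Carlo estimate.

```python
import math
import random

# Assumption: the standard genetic code, with three stop codons out of 64.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def orf_survival(t: int) -> float:
    """Probability that t consecutive random triplets contain no stop codon."""
    return (61 / 64) ** t


def half_length_triplets() -> float:
    """Number of triplets at which the survival probability falls to 1/2."""
    return math.log(0.5) / math.log(61 / 64)


def simulate_survival(t: int, trials: int = 20000, seed: int = 0) -> float:
    """Monte Carlo estimate of orf_survival(t) under uniform, independent bases."""
    rng = random.Random(seed)
    open_count = 0
    for _ in range(trials):
        # Draw t random codons and check that none of them is a stop codon.
        codons = ("".join(rng.choice("ACGT") for _ in range(3)) for _ in range(t))
        if all(codon not in STOP_CODONS for codon in codons):
            open_count += 1
    return open_count / trials


if __name__ == "__main__":
    for t in (10, 40, 50):
        print(t, round(orf_survival(t), 4), round(simulate_survival(t), 4))
```

Under these assumptions, a 150 bp exon corresponds to t = 50 triplets, so `orf_survival(50)` gives the chance that a random stretch of that length stays open in a given reading frame. Real genomes have skewed base composition, so the 61/64 factor is only an idealization.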

A relationship between GC content and coding-sequence length
TLDR
The analysis of DNA sequences from several genome databases stratified according to GC content reveals that the longest coding sequences (exons in vertebrates and genes in prokaryotes) are GC-rich, while the shortest ones are GC-poor, so that coding-sequence length is a function of GC content.
Statistical analysis and prediction of the exonic structure of human genes
  • M. Gelfand
  • Biology, Medicine
  • Journal of Molecular Evolution
  • 2004
TLDR
The analysis of the codon usage in the signal positions leads to the conclusion that the prevalence of some amino acids in corresponding protein sites is caused by the signal requirements and not vice versa.
Spontaneous symmetry breaking in genome evolution
TLDR
This work combines an analysis of the most recent achievements of genomics and fundamental concepts of random processes to provide a novel point of view on genome evolution.
Endogenous mechanisms for the origins of spliceosomal introns.
TLDR
Two further hypotheses are proposed that are broadly based on central cellular processes: 1) internal gene duplication and 2) the response to aberrant and fortuitously spliced transcripts. Together they provide a powerful way to explain the establishment of spliceosomal introns in eukaryotes without invoking an exogenous source.

References

Origin of eukaryotic introns: a hypothesis, based on codon distribution statistics in genes, and its implications.
  • P. Senapathy
  • Biology, Medicine
  • Proceedings of the National Academy of Sciences of the United States of America
  • 1986
TLDR
Introns are suggested to be those stretches of sequences containing interfering stop codons that were originally earmarked in the first primitive cells to be eliminated in order to enable the coding for long polypeptides in eukaryotic genes.
Proteins of Escherichia coli come in sizes that are multiples of 14 kDa: domain concepts and evolutionary implications.
  • M. Savageau
  • Biology, Medicine
  • Proceedings of the National Academy of Sciences of the United States of America
  • 1986
TLDR
The distribution of gene lengths for E. coli suggests regular clustering, which implies that the clustering of protein molecular masses is not an artifact of the molecular mass measurement by gel electrophoresis, and suggests the existence of a fundamental structural unit.
Relationship between the total size of exons and introns in protein-coding genes of higher eukaryotes.
  • H. Naora, N. Deacon
  • Biology, Medicine
  • Proceedings of the National Academy of Sciences of the United States of America
  • 1982
TLDR
It is proposed that conservation of sequences, which is required by the family members, internal repeats, or the entire gene, would actually motivate the removal of introns.
Structure of vertebrate genes: A statistical analysis implicating selection
  • M. Smith
  • Biology, Medicine
  • Journal of Molecular Evolution
  • 2005
TLDR
Introns occur at nonrandom frequencies within the codon frame, in untranslated regions, and relative to the frameshift potential from exon movement or duplication, demonstrating that models of gene evolution must incorporate selective processes.
Correlation of DNA exonic regions with protein structural units in haemoglobin
  • M. Go
  • Biology, Medicine
  • Nature
  • 1981
TLDR
In haemoglobin, which has no obvious domain structure, no clear conformational characteristics have so far been recognized for the segments encoded by exons, but a close inspection of their conformations by drawing various stereodiagrams and the Cα–Cα distance map is proposed.
The nucleotide sequence of the chicken thymidine kinase gene and the relationship of its predicted polypeptide to that of the vaccinia virus thymidine kinase.
TLDR
The entire DNA nucleotide sequence of a 3.0 kilobase pair Hind III fragment containing the chicken cytoplasmic thymidine kinase gene was determined and the proposed 244 amino acid polypeptide encoded by this gene bears strong homology to the vaccinia virus thymidine kinase.
Do exons code for structural or functional units in proteins?
  • T. Traut
  • Biology, Medicine
  • Proceedings of the National Academy of Sciences of the United States of America
  • 1988
TLDR
The available data show that exons are fairly limited in size but are large enough to specify structure-function modules in proteins, and that it is possible that the observed relationship of exons to protein structure represents a degenerate state of an ancestral correspondence between exons and structure-function modules in proteins.
Nucleotide sequence of the human c‐myc locus: provocative open reading frame within the first exon.
TLDR
The nucleotide sequence of a HindIII‐EcoRI DNA fragment, 8 kbp long, of a lambda recombinant containing the whole human c‐myc gene has been deduced by the method of Maxam and Gilbert and speculations about the role of that putative protein on the regulation of the expression of exons 2 and 3 are made.
Evolution and organization of the human protein C gene.
TLDR
The similarity of the genes for factor IX and protein C suggests that they may be the most closely related members of the serine protease gene family involved in coagulation and fibrinolysis.
Large introns in the 3' end of the gene for the pro alpha 1 (IV) chain of human basement membrane collagen.
TLDR
The results suggest that the gene for the pro alpha 1(IV) chain of human basement membrane collagen is significantly larger than the genes for fibrillar collagens and show that it lacks the 54-bp exon repeats characteristic of fibrillar collagen genes.