Comparison of archaeal and bacterial genomes: computer analysis of protein sequences predicts novel functions and suggests a chimeric origin for the archaea

@article{Koonin1997ComparisonOA,
  title={Comparison of archaeal and bacterial genomes: computer analysis of protein sequences predicts novel functions and suggests a chimeric origin for the archaea},
  author={E. Koonin and A. Mushegian and Michael Y. Galperin and D. R. Walker},
  journal={Molecular Microbiology},
  year={1997},
  volume={25}
}
Protein sequences encoded in three complete bacterial genomes, those of Haemophilus influenzae, Mycoplasma genitalium and Synechocystis sp., and the first available archaeal genome sequence, that of Methanococcus jannaschii, were analysed using the blast2 algorithm and methods for amino acid motif detection. Between 75% and 90% of the predicted proteins encoded in each of the bacterial genomes and 73% of the M. jannaschii proteins showed significant sequence similarity to proteins from other…
The COG database: a tool for genome-scale analysis of protein functions and evolution
TLDR
The database of Clusters of Orthologous Groups of proteins (COGs) is an attempt on a phylogenetic classification of the proteins encoded in 21 complete genomes of bacteria, archaea and eukaryotes.
Evolutionary genomics of archaeal viruses: unique viral genomes in the third domain of life.
TLDR
It is concluded that crenarchaeal viruses are, in general, evolutionarily unrelated to other known viruses and, probably, evolved via independent accretion of genes derived from the hosts and, through more complex routes of horizontal gene transfer, from other prokaryotes.
Comparative genomics of the Archaea (Euryarchaeota): evolution of conserved protein families, the stable core, and the variable shell.
TLDR
Comparative analysis of the protein sequences encoded in the four euryarchaeal species whose genomes have been sequenced completely revealed 1326 orthologous sets, of which 543 are represented in all four species, and previously undetected orthologs in bacteria and eukaryotes were identified.
Clusters of orthologous genes for 41 archaeal genomes and implications for evolutionary genomics of archaea
TLDR
The arCOGs provide a convenient, flexible framework for functional annotation of archaeal genomes, comparative genomics and evolutionary reconstructions and suggest that the last common ancestor of archaea might have been (nearly) as advanced as the modern archaeal hyperthermophiles.
The Deep Archaeal Roots of Eukaryotes
TLDR
A comprehensive set of 355 eukaryotic genes of apparent archaeal origin identified through ortholog detection and phylogenetic analysis is described and it is indicated that, for the majority of these genes, the preferred tree topology is one with the eukaryotic branch placed outside the extant diversity of archaea.
Protein Phylogenies and Signature Sequences: A Reappraisal of Evolutionary Relationships among Archaebacteria, Eubacteria, and Eukaryotes
  • Radhey S. Gupta
  • Biology, Medicine
  • Microbiology and Molecular Biology Reviews
  • 1998
TLDR
Evidence from indels supports the view that the archaebacteria probably evolved from gram-positive bacteria and suggests that this evolution occurred in response to antibiotic selection pressures, and an alternative model of microbial evolution based on the use of indels of conserved proteins and the morphological features of prokaryotic organisms is proposed.
What are archaebacteria: life's third domain or monoderm prokaryotes related to Gram‐positive bacteria? A new proposal for the classification of prokaryotic organisms
  • Radhey S. Gupta
  • Biology, Medicine
  • Molecular microbiology
  • 1998
TLDR
The hypothesis that archaebacteria and eukaryotes shared a common ancestor exclusive of eubacteria is not supported and evidence is provided for an alternate view of the evolutionary relationship among living organisms that is different from the currently popular three‐domain proposal.
Horizontal Transfer of Archaeal Genes into the Deinococcaceae: Detection by Molecular and Computer-Based Approaches
TLDR
Compared to the total number of ORFs in the genome, those that can be identified as having been acquired from Archaea or Eukaryotes are relatively few (approximately 1%), suggesting that interdomain transfer is rare.
Towards understanding the first genome sequence of a crenarchaeon by genome annotation using clusters of orthologous groups of proteins (COGs)
TLDR
Special-purpose databases organized on the basis of phylogenetic analysis and carefully curated with respect to known and predicted protein functions provide for a significant improvement in genome annotation.
Genomics of bacteria and archaea: the emerging dynamic view of the prokaryotic world
TLDR
The prokaryotic genome space is a tightly connected, although compartmentalized, network, a novel notion that undermines the ‘Tree of Life’ model of evolution and requires a new conceptual framework and tools for the study of prokaryotic evolution.

References

Novel protein families in archaean genomes.
TLDR
It is shown that the putative laminin receptor family of eukaryotes and an archaean homologue belong to the previously characterized ribosomal protein family S2 from eubacteria, suggesting that archaea seem to have a mode of expression of genetic information rather similar to eukaryotes, while eubacteria may have proceeded into unique ways of transcription and translation.
Sequence similarity analysis of Escherichia coli proteins: functional and evolutionary implications.
  • E. Koonin, R. Tatusov, K. Rudd
  • Biology, Medicine
  • Proceedings of the National Academy of Sciences of the United States of America
  • 1995
TLDR
It is concluded that bacterial protein sequences generally are highly conserved in evolution, with about 50% of all ACR-containing protein families represented among the E. coli gene products.
Phylogenetic analysis of 70 kD heat shock protein sequences suggests a chimeric origin for the eukaryotic cell nucleus
TLDR
To explain the phylogenies based on different gene sequences, a chimeric model for the origin of the eukaryotic cell nucleus involving fusion between an archaebacterium and a Gram-negative eubacterium is proposed.
A dnaK homolog in the archaebacterium Methanosarcina mazei S6.
TLDR
The gene described here is proposed to be the first member of the dnaK family sequenced from the archaebacterial kingdom (Archaea) and confirms that DnaK proteins are highly conserved, occurring not only in eubacteria and eukaryotes (Eucaria), but also in Archaea.
Complete sequence analysis of the genome of the bacterium Mycoplasma pneumoniae.
TLDR
The entire genome of the bacterium Mycoplasma pneumoniae M129 has been sequenced and a functional classification to a large number of ORFs is tentatively assigned and the biochemical and physiological properties of this bacterium are deduced.
Metabolism and evolution of Haemophilus influenzae deduced from a whole-genome comparison with Escherichia coli
TLDR
By comparing proteins encoded by the two bacterial genomes, it is shown that extensive gene shuffling and variation in the extent of gene paralogy are major trends in bacterial evolution; this comparison has also allowed us to deduce crucial aspects of the largely uncharacterized metabolism of H. influenzae.
Protein-based phylogenies support a chimeric origin for the eukaryotic genome.
TLDR
The hypothesis of a chimeric origin for the eukaryotic cell nucleus formed from the fusion of an archaebacterium and a Gram-negative bacterium is supported.
Sequencing and analysis of bacterial genomes
TLDR
Sequence comparisons show that most bacterial proteins are highly conserved in evolution, allowing predictions to be made about the functions of most products of an uncharacterized genome.
Sequence analysis of the genome of the unicellular cyanobacterium Synechocystis sp. strain PCC6803. II. Sequence determination of the entire genome and assignment of potential protein-coding regions.
  • T. Kaneko, S. Sato, +21 authors S. Tabata
  • Medicine, Biology
  • DNA research : an international journal for rapid publication of reports on genes and genomes
  • 1996
The sequence determination of the entire genome of the Synechocystis sp. strain PCC6803 was completed. The total length of the genome finally confirmed was 3,573,470 bp, including the previously…
Computer analysis of bacterial haloacid dehalogenases defines a large superfamily of hydrolases with diverse specificity. Application of an iterative approach to database search.
TLDR
It is shown that bacterial haloacid dehalogenases (HADs) belong to a large superfamily of hydrolases with diverse substrate specificity and many of the proteins with known enzymatic activities in the HAD superfamily are involved in detoxification of xenobiotics or metabolic by-products.