A major update of the previously developed system for delineation of Clusters of Orthologous Groups of proteins (COGs) from the sequenced genomes of prokaryotes and unicellular eukaryotes is described and is expected to be a useful platform for functional annotation of newly sequenced genomes, including those of complex eukaryotes, and for genome-wide evolutionary studies.
An updated analysis of the evolutionary relationships between CRISPR–Cas systems and Cas proteins is provided and a 'polythetic' classification that integrates the phylogenies of the most common cas genes, the sequence and organization of the CRISPR repeats and the architecture of the CRISPR–cas loci is proposed.
Composition-based statistics, that is, the use, for each database sequence, of a position-specific scoring system tuned to that sequence's amino acid composition, are particularly beneficial for large-scale automated applications of PSI-BLAST.
An approach combining the analysis of signature protein families and features of the architecture of cas loci that unambiguously partitions most CRISPR–cas loci into distinct classes, types and subtypes is presented.
The distribution of nucleotide specificity among the proteins of the GTPase superclass indicates that the common ancestor of the entire superclass was a GTPase and that a secondary switch to ATPase activity has occurred on several independent occasions during evolution.
Phylogenetic analyses, comparison of gene content across the group, and reconstruction of ancestral gene sets indicate a combination of extensive gene loss and key gene acquisitions via horizontal gene transfer during the coevolution of lactic acid bacteria with their habitats.
An update of the Clusters of Orthologous Groups of proteins, the first since 2003, and a comprehensive revision of the COG annotations and expansion of the genome coverage to include representative complete genomes from all bacterial and archaeal lineages down to the genus level are presented.
It appears most likely that CASS is a prokaryotic system of defense against phages and plasmids that functions via the RNAi mechanism, which seems to involve integration of fragments of foreign genes into archaeal and bacterial chromosomes yielding heritable immunity to the respective agents.
Functional and evolutionary patterns in the recently constructed set of 5,873 clusters of predicted orthologs (eukaryotic orthologous groups or KOGs) from seven eukaryotic genomes are examined, revealing a conserved core of largely essential eukaryotic genes as well as major diversification and innovation associated with evolution of eukaryotic genomes.
Comparison of C. acetobutylicum to Bacillus subtilis reveals significant local conservation of gene order, which has not been seen in comparisons of other genomes with similar or, in some cases, closer phylogenetic proximity.