A complete domain-to-species taxonomy for Bacteria and Archaea

  title={A complete domain-to-species taxonomy for Bacteria and Archaea},
  author={Donovan H. Parks and Maria S. Chuvochina and Pierre-Alain Chaumeil and Christian Rinke and Aaron J. Mussig and Philip Hugenholtz},
  journal={Nature Biotechnology},
  pages={1 - 8}
The Genome Taxonomy Database is a phylogenetically consistent, genome-based taxonomy that provides rank-normalized classifications for ~150,000 bacterial and archaeal genomes from domain to genus. However, almost 40% of the genomes in the Genome Taxonomy Database lack a species name. We address this limitation by using commonly accepted average nucleotide identity criteria to set bounds on species and propose species clusters that encompass all publicly available bacterial and archaeal genomes… 
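
As a rough illustration of how an average nucleotide identity (ANI) bound can partition genomes into species clusters, the sketch below greedily assigns each genome to the first representative it matches at or above a threshold. It is a simplified stand-in, not the procedure used by the authors: the function name `cluster_by_ani`, the visiting order, the 95% default (taken from the FastANI entry further down this page), and the toy ANI values are all assumptions made for the example.

```python
from typing import Dict, FrozenSet, List

ANI_THRESHOLD = 95.0  # assumed species boundary, in percent


def cluster_by_ani(
    genomes: List[str],
    ani: Dict[FrozenSet[str], float],
    threshold: float = ANI_THRESHOLD,
) -> Dict[str, List[str]]:
    """Greedily group genomes around representatives (hypothetical helper).

    Genomes are visited in the order given. Each one joins the first existing
    representative with ANI >= threshold; otherwise it starts a new cluster.
    Pairs missing from `ani` are treated as 0% (too divergent to compare).
    """
    clusters: Dict[str, List[str]] = {}
    for genome in genomes:
        for rep in clusters:
            if ani.get(frozenset((rep, genome)), 0.0) >= threshold:
                clusters[rep].append(genome)
                break
        else:
            # No representative was close enough: this genome founds a cluster.
            clusters[genome] = [genome]
    return clusters


# Made-up ANI values, for illustration only.
toy_ani = {
    frozenset(("g1", "g2")): 97.2,
    frozenset(("g1", "g3")): 88.5,
    frozenset(("g2", "g3")): 89.1,
}
species = cluster_by_ani(["g1", "g2", "g3"], toy_ani)
print(species)
```

With these toy values, `g1` and `g2` fall into one cluster and `g3` forms its own. The result of a greedy scheme like this depends on the order in which genomes are visited, so a real pipeline would need an explicit rule for choosing representatives.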

A standardized archaeal taxonomy for the Genome Taxonomy Database.

A standardized archaeal taxonomy is proposed, derived from a 122-concatenated-protein phylogeny that resolves polyphyletic groups and normalizes ranks based on relative evolutionary divergence. Using simulated datasets, this normalization is shown to robustly correct for substitution rates varying up to 30-fold.

Resolving widespread incomplete and uneven archaeal classifications based on a rank-normalized genome-based taxonomy

A standardized archaeal taxonomy is proposed, as part of the Genome Taxonomy Database (GTDB), derived from a 122 concatenated protein phylogeny that resolves polyphyletic groups and normalizes ranks based on relative evolutionary divergence (RED).

Naming the unnamed: over 65,000 Candidatus names for unnamed Archaea and Bacteria in the Genome Taxonomy Database.

This work exploits an approach to the generation of well-formed arbitrary Latinate names at a scale sufficient to name tens of thousands of unnamed taxa within GTDB.

GTDB: an ongoing census of bacterial and archaeal diversity through a phylogenetically consistent, rank normalized and complete genome-based taxonomy

Prokaryotic diversity is explored from the perspective of the GTDB, the importance of metagenome-assembled genomes in expanding available genomic representation is highlighted, and the use of average nucleotide identities as a pragmatic approach for delineating prokaryotic species is discussed.

It is time for a new type of type to facilitate naming the microbial world

Diversity, function and evolution of marine microbe genomes

The database provides a comprehensive resource for the marine microbiome and would be a valuable reference for studies of the origin and evolution of marine life, ecological monitoring and protection, and bioactive compound development.

Functional and evolutionary significance of unknown genes from uncultivated taxa

A global multi-habitat dataset is analyzed and 980 previously neglected protein families that can accurately distinguish entire uncultivated phyla, classes, and orders are found, likely representing synapomorphic traits that fostered their divergence.

Roadmap for naming uncultivated Archaea and Bacteria

The authors discuss the issue of naming uncultivated prokaryotic microorganisms, which currently do not have a formal nomenclature system due to a lack of type material or cultured representatives, and propose two recommendations including the recognition of DNA sequences as type material.



A Genus Definition for Bacteria and Archaea Based on a Standard Genome Relatedness Index

This study proposes a genus definition that relies on the combined use of average nucleotide identity, genome alignment fraction, and the distinction between type and non-type species, and reports that genetic coherence is an emergent property of genera in Bacteria and Archaea.

A genus definition for Bacteria and Archaea based on genome relatedness and taxonomic affiliation

Results show that a distinct difference between distant relatives and close relatives at the genome level (i.e., genomic coherence) is an emergent property of genera in Bacteria and Archaea.

A standardized bacterial taxonomy based on genome phylogeny substantially revises the tree of life

This work used a concatenated protein phylogeny as the basis for a bacterial taxonomy that conservatively removes polyphyletic groups and normalizes taxonomic ranks on the basis of relative evolutionary divergence.

Towards a Genome-Based Taxonomy for Prokaryotes

The AAI-based approach provides a means to evaluate the robustness of alternative genetic markers for phylogenetic purposes, and could contribute significantly to a genome-based taxonomy for all microbial organisms.

Microbial species delineation using whole genome sequences

This work demonstrates that the combination of gANI and the alignment fraction between two genomes accurately reflects their genomic relatedness, and proposes this precise and objective AF,gANI-based species definition: the MiSI (Microbial Species Identifier) method, to be used to address previous inconsistencies in species classification.

An improved Greengenes taxonomy with explicit ranks for ecological and evolutionary analyses of bacteria and archaea

A ‘taxonomy to tree’ approach for transferring group names from an existing taxonomy to a tree topology is developed and used to apply the Greengenes, National Center for Biotechnology Information (NCBI) and cyanoDB (Cyanobacteria only) taxonomies to a de novo tree comprising 408,315 sequences.

Genomic insights that advance the species definition for prokaryotes.

The average nucleotide identity (ANI) of the shared genes between two strains was found to be a robust means to compare genetic relatedness among strains, and ANI values of approximately 94% were found to correspond to the traditional 70% DNA-DNA reassociation standard of the current species definition.

High throughput ANI analysis of 90K prokaryotic genomes reveals clear species boundaries

FastANI, a method to compute ANI using alignment-free approximate sequence mapping, is developed, and an analysis of about 90,000 prokaryotic genomes shows that 95% ANI is an accurate threshold for demarcating prokaryotic species.

1,003 reference genomes of bacterial and archaeal isolates expand coverage of the tree of life

We present 1,003 reference genomes that were sequenced as part of the Genomic Encyclopedia of Bacteria and Archaea (GEBA) initiative, selected to maximize sequence coverage of phylogenetic space.

Phylogenomics of 10,575 genomes reveals evolutionary proximity between domains Bacteria and Archaea

A reference phylogeny of 10,575 evenly-sampled bacterial and archaeal genomes, based on a comprehensive set of 381 markers, is built, providing an updated view of domain-level relationships between Archaea and Bacteria.