The Protist Ribosomal Reference database (PR2): a catalog of unicellular eukaryote Small Sub-Unit rRNA sequences with curated taxonomy

@article{Guillou2013ThePR,
  title={The Protist Ribosomal Reference database (PR2): a catalog of unicellular eukaryote Small Sub-Unit rRNA sequences with curated taxonomy},
  author={Laure Guillou and Dipankar Bachar and St{\'e}phane Audic and David Bass and C{\'e}dric Berney and Lucie Bittner and Christophe Boutte and Ga{\"e}tan Burgaud and Colomban de Vargas and Johan Decelle and Javier del Campo and John R. Dolan and Micah Dunthorn and Bente Edvardsen and Maria Holzmann and Wiebe H.C.F. Kooistra and Enrique Lara and Noan Le Bescot and Ramiro Logares and Fr{\'e}d{\'e}ric Mah{\'e} and Ramon Massana and Marina Montresor and Raphael Morard and Fabrice Not and Jan Pawlowski and Ian Probert and Anne-Laure Sauvadet and Raffaele Siano and Thorsten Stoeck and Daniel Vaulot and Pascale Zimmermann and Richard Christen},
  journal={Nucleic Acids Research},
  year={2013},
  volume={41},
  pages={D597--D604}
}
The interrogation of genetic markers in environmental meta-barcoding studies is currently seriously hindered by the lack of taxonomically curated reference data sets for the targeted genes. The Protist Ribosomal Reference database (PR2, http://ssu-rrna.org/) provides a unique access to eukaryotic small sub-unit (SSU) ribosomal RNA and DNA sequences, with curated taxonomy. The database mainly consists of nuclear-encoded protistan sequences. However, metazoans, land plants, macrosporic fungi and… 
µgreen-db: a reference database for the 23S rRNA gene of eukaryotic plastids and cyanobacteria
TLDR
A reference database for the 23S rRNA gene, called µgreen-db, is set up. It was able to assign at the genus level 96% of the sequences of the V domain of the 23S rRNA gene obtained by metabarcoding after amplification from soil DNA, highlighting good coverage of the database.
EukRef-excavates: seven curated SSU ribosomal RNA gene databases
TLDR
A set of EukRef-curated databases is presented for the excavate protists, a large assemblage that includes numerous taxa with divergent SSU rRNA gene sequences, which are prone to misclassification.
PhytoREF: a reference database of the plastidial 16S rRNA gene of photosynthetic eukaryotes with curated taxonomy
TLDR
The PhytoREF database is built; it contains 6490 plastidial 16S rDNA reference sequences that originate from a large diversity of eukaryotes representing all known major photosynthetic lineages. It mainly focuses on marine microalgae, but sequences from land plants and freshwater taxa were also included to broaden the applicability of PhytoREF to different aquatic and terrestrial habitats.
EukRef‐Ciliophora: a manually curated, phylogeny‐based database of small subunit rRNA gene sequences of ciliates
TLDR
The approach included the inference of phylogenetic trees for every ciliate lineage and produced the largest SSU rRNA tree of the phylum Ciliophora to date. The resulting database is superior to the current SILVA database in classifying HTS reads from a global marine survey.
Metazoan mitochondrial gene sequence reference datasets for taxonomic assignment of environmental samples
TLDR
All metazoan mitochondrial gene sequences are retrieved from GenBank, and the datasets are quality filtered and formatted for use with taxonomic assignment tools; the mitochondrial cytochrome oxidase subunit I gene was the most sequence-rich gene.
metaPR2: a database of eukaryotic 18S rRNA metabarcodes with an emphasis on protists
TLDR
A newly-assembled database of processed 18S rRNA metabarcodes that are annotated with the PR2 reference sequence database is presented, called metaPR2, which contains 41 datasets corresponding to more than 4,000 samples and 73,000 ASVs.
dinoref: A curated dinoflagellate (Dinophyceae) reference database for the 18S rRNA gene
TLDR
An updated 18S rRNA reference database of dinoflagellates, dinoref, is provided, offering an opportunity to test the level of taxonomic resolution of different 18S barcode markers based on a large number of sequences and species.
EukRef: Phylogenetic curation of ribosomal RNA to enhance understanding of eukaryotic diversity and distribution
TLDR
EukRef organizes and facilitates rigorous sequence data mining and annotation by providing protocols, guidelines and tools to do so, and develops reliable reference databases across the diversity of microbial eukaryotes.
Investigating Microbial Eukaryotic Diversity from a Global Census: Insights from a Comparison of Pyrotag and Full-Length Sequences of 18S rRNA Genes
TLDR
Operational taxonomic units derived from full-length and pyrotag sequences of 18S rRNA genes from 10 global samples were analyzed and found to provide holistic assessments of protistan communities, although care must be taken in interpreting the results.
pr2-primers: an 18S rRNA primer database for protists
TLDR
A database listing 179 primers and 76 primer pairs that have been used for eukaryotic 18S rRNA metabarcoding, and an R-based web application that allows users to browse the database, visualize the taxonomic distribution of the amplified sequences with the number of mismatches, and test any user-defined primer set.

References

SHOWING 1-10 OF 29 REFERENCES
SILVA: a comprehensive online resource for quality checked and aligned ribosomal RNA sequence data compatible with ARB
TLDR
SILVA (from Latin silva, forest) was implemented to provide a central comprehensive web resource for up-to-date, quality-controlled databases of aligned rRNA sequences from the Bacteria, Archaea and Eukarya domains.
How many novel eukaryotic 'kingdoms'? Pitfalls and limitations of environmental DNA surveys
TLDR
The results suggest that the number of novel higher-level taxa revealed by previously published EES was overestimated, and there is no clear evidence for a spectacular increase of the diversity at the kingdom level.
Greengenes, a Chimera-Checked 16S rRNA Gene Database and Workbench Compatible with ARB
TLDR
There is incongruent taxonomic nomenclature among curators even at the phylum level, and environmental sequences were classified into 100 phylum-level lineages in the Archaea and Bacteria.
The Ribosomal Database Project (RDP-II): sequences and tools for high-throughput rRNA analysis
The Ribosomal Database Project (RDP-II) provides the research community with aligned and annotated rRNA gene sequences, along with analysis services and a phylogenetically consistent taxonomic
A New Web Server for the Rapid Identification of Microorganisms
TLDR
A new web server for fast and reliable identifications of microbial isolates, based on databases of cultured species for Bacteria, Archaea and Protozoa, that allows retrieving related sequences and their taxonomies in order to proceed to phylogenetic analyses.
Eukaryotic Richness in the Abyss: Insights from Pyrotag Sequencing
TLDR
The deep-sea floor appears as a global DNA repository, which preserves genetic information about organisms living in the sediment, as well as in the water column above it, which can be used for future monitoring of past and present environmental changes.
Detection of Introns in Eukaryotic Small Subunit Ribosomal RNA Gene Sequences
TLDR
The gene encoding SSU-rRNA sequences is the tool of choice for phylogenetic analyses and environmental biodiversity analyses of Bacteria, Archaea and also unicellular Eukaryota, and descriptions of 3638 such sequences are found.
Evolutionary history of "early-diverging" eukaryotes: the excavate taxon Carpediemonas is a close relative of Giardia.
TLDR
Although diplomonads and retortamonads lack any mitochondria-like organelle, Carpediemonas contains double membrane-bounded structures physically resembling hydrogenosomes, suggesting that it will be valuable in interpreting the evolutionary significance of many molecular and cellular peculiarities of diplomonads.
Depicting more accurate pictures of protistan community complexity using pyrosequencing of hypervariable SSU rRNA gene regions.
TLDR
A fast and efficient strategy to discriminate pyrosequencing signals from noise is suggested in order to more realistically depict the structure of protistan communities using simple tools that are implemented in standard tag data-processing pipelines.
Protistan microbial observatory in the Cariaco Basin, Caribbean. I. Pyrosequencing vs Sanger insights into species richness
TLDR
This large data set provided the first statistically sound prediction of the total size of protistan richness in a large and varied environment, such as the Cariaco Basin: over 36 000 species, defined as almost full-length 18S rRNA gene sequence clusters sharing over 99% sequence homology.