The taxonomic name resolution service: an online tool for automated standardization of plant names

@article{Boyle2012TheTN,
  title={The taxonomic name resolution service: an online tool for automated standardization of plant names},
  author={Brad L. Boyle and Nicole Hopkins and Zhenyuan Lu and Juan Antonio Raygoza Garay and Dmitry Y. Mozzherin and Tony Rees and Naim Matasci and Martha L. Narro and William H. Piel and Sheldon J. McKay and Sonya J. Lowry and Chris Freeland and Robert K. Peet and Brian J. Enquist},
  journal={BMC Bioinformatics},
  year={2012},
  volume={14},
  pages={16}
}
Background: The digitization of biodiversity data is leading to the widespread application of taxon names that are superfluous, ambiguous or incorrect, resulting in mismatched records and inflated species numbers. The ultimate consequences of misspelled names and bad taxonomy are erroneous scientific conclusions and faulty policy decisions. The lack of tools for correcting this ‘names problem’ has become a fundamental obstacle to integrating disparate data sources and advancing the progress of…
A review of software tools for spell‐checking taxon names in vegetation databases
TLDR
Six software tools that spell-check taxon names in vegetation databases are reviewed and the Global Names Resolver emerged as the most versatile software tool.
A Standardized Reference Data Set for Vertebrate Taxon Name Resolution
TLDR
A carefully human-vetted analysis of 1000 verbatim scientific names taken at random from those published via the data aggregator VertNet, providing the first rigorously reviewed, reference validation data set.
Retrieving taxa names from large biodiversity data collections using a flexible matching workflow
Match Algorithms for Scientific Names in FlorItaly, the Portal to the Flora of Italy
TLDR
A near match algorithm to resolve misspelled scientific names has been integrated in the query systems of FlorItaly and a novel tool, capable of rapidly aligning any list of names to the nomenclatural backbone provided by the national checklists, has been developed.
Controlling the taxonomic variable: Taxonomic concept resolution for a southeastern United States herbarium portal
Overview. Taxonomic names are imperfect identifiers of specific and sometimes conflicting taxonomic perspectives in aggregated biodiversity data environments. The inherent ambiguities of names can be
Challenges with using names to link digital biodiversity information
TLDR
The name-strings in GenBank, Catalogue of Life (CoL), and the Dryad Digital Repository are compared to assess the effectiveness of the current names-management toolkit developed by Global Names to achieve interoperability among distributed data sources.
The use and limits of scientific names in biological informatics
TLDR
A lack of consistency in the cardinal relationship between names and taxa places limits on how scientific names may be used in biological informatics in initially anchoring, and in the subsequent retrieval and integration, of relevant biodiversity information.
Taxamatch, an Algorithm for Near (‘Fuzzy’) Matching of Scientific Names in Taxonomic Databases
  • T. Rees
  • Computer Science, PLoS ONE
  • 2014
TLDR
Taxamatch, an improved name-matching solution for this information domain, is described. It employs a custom Modified Damerau-Levenshtein Distance algorithm in tandem with a phonetic algorithm, together with a rule-based approach incorporating a suite of heuristic filters, to produce improved levels of recall, precision and execution time over the existing dynamic programming algorithms n-grams and standard edit distance.
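As a rough illustration of the edit-distance idea named in the summary above, the sketch below computes the optimal string alignment distance (the restricted form of Damerau-Levenshtein distance) and uses it to rank candidate names. This is not Taxamatch's Modified Damerau-Levenshtein algorithm, and it omits the phonetic step and heuristic filters. The function names `osa_distance` and `near_matches`, the threshold `max_distance`, and the sample names are assumptions made for the example.

```python
def osa_distance(a: str, b: str) -> int:
    """Optimal string alignment distance between a and b.

    Counts insertions, deletions, substitutions and transpositions of
    adjacent characters, each at cost 1 (restricted Damerau-Levenshtein).
    """
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i  # delete all i characters of a
    for j in range(cols):
        d[0][j] = j  # insert all j characters of b
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # match or substitution
            )
            # Adjacent transposition, e.g. "ra" <-> "ar".
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[-1][-1]


def near_matches(query, reference_names, max_distance=2):
    """Return (distance, name) pairs within max_distance, closest first.

    Hypothetical helper: comparison is case-insensitive and the threshold
    is an arbitrary illustrative choice, not a value from any cited tool.
    """
    q = query.strip().lower()
    scored = [(osa_distance(q, name.lower()), name) for name in reference_names]
    return sorted(pair for pair in scored if pair[0] <= max_distance)


if __name__ == "__main__":
    reference = ["Quercus rubra", "Quercus robur", "Pinus rubra"]
    for distance, name in near_matches("Quercus rubar", reference):
        print(distance, name)
```

A fixed distance threshold is the simplest possible cut-off. A production matcher would more plausibly scale the threshold with name length and compare genus and epithet separately.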
WorldFlora: An R package for exact and fuzzy matching of plant names against the World Flora Online taxonomic backbone data
  • R. Kindt
  • Environmental Science, bioRxiv
  • 2020
TLDR
WorldFlora offers a straightforward pipeline for semi-automatic plant name checking by matching lists of plant names with a static copy from World Flora Online, an ongoing global effort of completing an online flora of all known vascular plants and bryophytes by 2020.
taxadb: A high‐performance local taxonomic database interface
TLDR
The taxadb R package is presented, which creates a local database, managed automatically from within R, to provide fast operations on millions of taxonomic names, and which provides access to established naming authorities to resolve synonyms, taxonomic identifiers, and hierarchical classification in a consistent and intuitive data format.

References

On the Use of Taxonomic Concepts in Support of Biodiversity Research and Taxonomy
Future biodiversity research will make increased use of distributed data networks, scientific workflows, and powerful mechanisms for resolving a broad spectrum of primary data. This paper outlines
LINNAEUS: A species name identification system for biomedical literature
TLDR
LINNAEUS is an open source, stand-alone software system capable of recognizing and normalizing species name mentions with speed and accuracy, and can be integrated into a range of bioinformatics and text-mining applications.
Perspectives: Towards a language for mapping relationships among taxonomic concepts
TLDR
A comprehensive and powerful language for representing the relationships among taxonomic concepts, which will facilitate a more precise documentation of similarities and differences in multiple succeeding taxonomic perspectives, thereby preparing the stage for an ontology‐based integration of taxonomic and related biological information.
Towards a collaborative, global infrastructure for biodiversity assessment
TLDR
Discusses how web-based georeferencing tools that utilize best practices and gazetteer databases can be employed to improve geographic data quality, and how improved taxonomic data quality will help transform today's portals to raw biodiversity data into nexuses of collaborative creation and sharing of biodiversity knowledge.
GenBank
TLDR
GenBank® is a comprehensive database that contains publicly available nucleotide sequences for over 340 000 formally described species and integrates these records with a variety of other data including taxonomy nodes, genomes, protein structures, and biomedical journal literature in PubMed.
The New Taxonomy
TLDR
Introductory: Towards the New Taxonomy, Q.D. Wheeler; Networks and Their Role in e-Taxonomy, M.J. Scoble; Taxonomy as a Team Sport; and Understanding Morphology in Systematic Contexts: Three-Dimensional Specimen Ordination and Recognition.
Towards integrative taxonomy
TLDR
Seven guidelines are proposed to help integrative taxonomists recognize cases when species are supported by broad biological evidence and therefore are deserving of an official name and to prevent the over-abundance of both synonyms and names of doubtful application from worsening.
VegBank – a permanent, open-access archive for vegetation-plot data
Rapid progress is being made in North American vegetation science through recent developments within the U.S. National Vegetation Classification (USNVC). Central to these advances are sharing,
Forest Inventory and Analysis Database of the United States of America (FIA)
Extensive vegetation inventories established with a probabilistic design are an indispensable tool in describing distributions of species and community types and detecting changes in composition in