Mining data from legacy taxonomic literature and application for sampling spiders of the Teutamus group (Araneae; Liocranidae) in Southeast Asia

  • Francisco Andrés Rivera-Quiroz, Booppa Petcharad, Jeremy A. Miller
  • Scientific Reports
Taxonomic literature contains information about virtually every known species on Earth. In many cases, all that is known about a taxon is contained in this kind of literature, particularly for the most diverse and understudied groups. Taxonomic publications in the aggregate have documented a vast amount of specimen data. Among other things, these data constitute evidence of the existence of a particular taxon within a spatial and temporal context. When knowledge about a particular taxonomic…
4 Citations
Using taxonomic treatments to assess an author's career: the impactful Jocélia Grazia.
Here we present a descriptive analysis of the bibliographic production of the world-renowned heteropterist Dr. Jocélia Grazia and comments on her taxonomic reach based on extracted taxonomic
Automating the Curation Process of Historical Literature on Marine Biodiversity Using Text Mining: The DECO Workflow
This work orchestrates IE tools and provides the curators with a unified view of the methodology; as a result, documentation of the strengths, limitations and dependencies of several tools was drafted.
A Local Discrete Text Data Mining Method in High-Dimensional Data Space
  • Juan Li, Aiping Chen
  • Computer Science
    International Journal of Computational Intelligence Systems
  • 2022
The simulation experiment results show that the method effectively reduces the time and improves the accuracy of data mining while also consuming less memory, indicating that the multi-objective optimization method can solve multiple problems and improve the data mining effect.
Biodiversity Change: Past, Present, and Future
  • A. Purvis, F. Isbell
  • Environmental Science
    The Ecological and Societal Consequences of Biodiversity Loss
  • 2022


Integrating and visualizing primary data from prospective and legacy taxonomic literature
It is demonstrated here that XML markup using GoldenGATE can address the challenge presented by unstructured legacy data and can extract structured primary biodiversity data, which can be aggregated with and jointly queried with data from other Darwin Core-compatible sources. It is also shown how visualization of these data can communicate key information contained in biodiversity literature.
Current GBIF occurrence data demonstrates both promise and limitations for potential red listing of spiders
GBIF data are shown to have potential as an additional source of information for conservation assessments, complementing literature data, but for spiders they are not particularly useful on their own as they stand right now.
Taxonomic information exchange and copyright: the Plazi approach
The information found in Plazi's databases – taxonomic treatments as well as the metadata of the publications – is in the public domain and can therefore be used for further scientific research without any restriction, whether or not it is contained in copyrighted publications.
EJT editorial standard for the semantic enhancement of specimen data in taxonomy literature
The guidelines stipulate controlled vocabularies and precise formats for presenting the specimens examined within a taxonomic publication, which allow for the rich data associated with the primary research material to be harvested, distributed and interlinked online via international biodiversity data aggregators.
A DNA barcode-assisted annotated checklist of the spider (Arachnida, Araneae) communities associated to white oak woodlands in Spanish National Parks
Molecular data confirmed putative new species with diagnosable morphology, identified overlooked lineages that may constitute new species, confirmed assignment of specimens of unknown sexes to species and identified cases of misidentifications and phenotypic polymorphisms.
Utilizing online resources for taxonomy: a cybercatalog of Afrotropical apiocerid flies (Insecta: Diptera: Apioceridae)
A cybercatalog to the Apioceridae (apiocerid flies) of the Afrotropical Region is provided, which includes links to open-access, online repositories to access taxonomic information, digitized literature, morphological descriptions, specimen occurrence data, and images.
Taxonomic bias in biodiversity data and societal preferences
Results show that societal preferences, rather than research activity, strongly correlate with taxonomic bias, which leads the authors to assert that scientists should advertise less charismatic species and develop societal initiatives (e.g. citizen science) that specifically target neglected organisms.
Entomological knowledge in Madagascar by GBIF datasets: estimates on the coverage and possible biases (Insecta)
The current Protected Areas (PAs) network covers about 70% of the collecting localities for the nine insect orders considered, even though some, such as Trichoptera, Odonata, and Neuroptera, seem significantly less protected than others.
The spider tree of life: phylogeny of Araneae based on target‐gene analyses from an extensive taxon sampling
We present a phylogenetic analysis of spiders using a dataset of 932 spider species, representing 115 families (only the family Synaphridae is unrepresented), 700 known genera, and additional
Amazonian biodiversity: assessing conservation priorities with taxonomic data
Museum collections can play a vital role in identifying species-rich areas for potential conservation in Amazonia, but a concerted and structured effort to increase the number and distribution of collections is needed to take maximum advantage of the information they contain.