Identifying known unknowns using the US EPA’s CompTox Chemistry Dashboard

Andrew D. McEachran, Jon R. Sobus, Antony J. Williams
Analytical and Bioanalytical Chemistry
Abstract: Chemical features observed using high-resolution mass spectrometry can be tentatively identified using online chemical reference databases by searching molecular formulae and monoisotopic masses and then rank-ordering the hits using appropriate relevance criteria. The most likely candidate “known unknowns,” which are those chemicals unknown to an investigator but contained within a reference database or literature source, rise to the top of a chemical list when rank-ordered by the…
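The search-then-rank idea in the abstract can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the `Candidate` record, the toy masses, the 5 ppm tolerance, and the generic `relevance` score (the abstract is truncated before it names the ranking criterion, so none is implied here). It does not call the Dashboard or any real database.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    name: str                 # hypothetical label, not a real database record
    monoisotopic_mass: float  # toy value in Da, invented for this example
    relevance: int            # hypothetical relevance score (higher ranks first)


def ppm_error(observed: float, theoretical: float) -> float:
    """Signed mass error of the observation, in parts per million."""
    return (observed - theoretical) / theoretical * 1e6


def rank_candidates(observed_mass, candidates, tolerance_ppm=5.0):
    """Keep candidates within the mass tolerance, then sort by relevance."""
    hits = [
        c for c in candidates
        if abs(ppm_error(observed_mass, c.monoisotopic_mass)) <= tolerance_ppm
    ]
    # Highest relevance first, so the most plausible candidate tops the list.
    return sorted(hits, key=lambda c: c.relevance, reverse=True)


# Toy reference list; all values are made up.
TOY_DATABASE = [
    Candidate("candidate_A", 200.0000, 3),
    Candidate("candidate_B", 200.0005, 40),
    Candidate("candidate_C", 200.0100, 90),  # about 49 ppm away, filtered out
    Candidate("candidate_D", 199.9995, 12),
]

ranked = rank_candidates(200.0002, TOY_DATABASE)
ranked_names = [c.name for c in ranked]
print(ranked_names)
```

In this toy case `candidate_C` has the highest relevance score but falls outside the mass window, so it never reaches the ranking step. The mass filter narrows the list first, and the relevance criterion only orders what remains.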
Linking in silico MS/MS spectra with chemistry data to improve identification of unknowns
The generation en masse of predicted MS/MS spectra for the entirety of the US EPA’s DSSTox database is described using competitive fragmentation modelling and a freely available open source tool, CFM-ID.
Open Science for Identifying "Known Unknown" Chemicals.
Challenges facing comprehensive suspect screening include a growing number of chemicals of interest and ever-decreasing detection limits, both of which lead to more false positives; open science is poised to play a pivotal role in the evolution of suspect screening.
Big Free-Access Chemical Databases in Non-Target Mass Spectrometry Analysis
The ChemSpider and PubChem chemical databases, used in non-target mass spectrometry analysis to outline candidates for identification, are described. Relevant compounds are searched by…
Using the US EPA CompTox Chemicals Dashboard to interpret targeted and non-targeted GC–MS analyses from human breath and other biological media
Specific procedures using the Dashboard as a first-stop tool for exploring both targeted and non-targeted results from GC–MS analyses of chemicals found in breath, exhaled breath condensate, and associated aerosols are described.
The CompTox Chemistry Dashboard: a community data resource for environmental chemistry
The U.S. Environmental Protection Agency’s web-based CompTox Chemistry Dashboard is addressing needs by integrating diverse types of relevant domain data through a cheminformatics layer, built upon a database of curated substances linked to chemical structures.
In silico MS/MS spectra for identifying unknowns: a critical examination using CFM-ID algorithms and ENTACT mixture samples
In silico spectra are shown to correctly identify true positives in complex samples (at rates comparable to those observed with reference spectra) and to efficiently filter large numbers of potential false positives from further consideration.
“MS-Ready” structures for non-targeted high-resolution mass spectrometry screening studies
The workflow for the generation and linking of ~700,000 MS-Ready structures, as well as download, search and export capabilities to serve structure identification using HRMS, are described.
Revisiting Five Years of CASMI Contests with EPA Identification Tools
The results suggest that Dashboard data and tools would enhance chemical identification capabilities for practitioners of HRMS-based NTA, and an in-depth review of the CASMI structure sets made these reviewed sets available via the Dashboard.


Identification of “Known Unknowns” Utilizing Accurate Mass Data and ChemSpider
These approaches were shown to be successful in identifying “known unknowns” noted in the laboratory and for compounds of interest to others.
Identification of “Known Unknowns” Utilizing Accurate Mass Data and Chemical Abstracts Service Databases
These approaches were shown to be successful in identifying “known unknowns” noted in LC-MS and even GC-MS analyses in the laboratory and were demonstrated in the identification of a variety of compounds of interest to others.
Is nontarget screening of emerging contaminants by LC-HRMS successful? A plea for compound libraries and computer tools
The advantages and future needs of publicly available MS and MS/MS reference databases and libraries, which have mostly been created for the metabolomics field, are discussed; the availability of comprehensive MS libraries with a focus on environmental contaminants would tremendously improve the situation.
In silico fragmentation for computer assisted identification of metabolite mass spectra
A method is presented that can identify small molecules from tandem MS measurements, even without spectral reference data or a large set of fragmentation rules.
Identifying small molecules via high resolution mass spectrometry: communicating confidence.
A level system is proposed, which arose from intense discussions within the department, to ease the communication of identification confidence and form the basis of further discussions on this topic, and specifically covers the new possibilities in HR-MS-based analysis.
MassBank: a public repository for sharing mass spectral data for life sciences.
MassBank is the first public repository of mass spectra of small chemical compounds for life sciences and provides a merged spectrum for each compound prepared by merging the analyzed ESI-MS(2) data on an identical compound under different collision-induced dissociation conditions.
PubChem Substance and Compound databases
An overview of the PubChem Substance and Compound databases is provided, including data sources and contents, data organization, data submission using PubChem Upload, chemical structure standardization, web-based interfaces for textual and non-textual searches, and programmatic access.
Linking high resolution mass spectrometry data with exposure and toxicity forecasts to advance high-throughput environmental monitoring.
Non-target screening with high-resolution mass spectrometry: critical review using a collaborative trial on water analysis
A dataset from a collaborative non-target screening trial organised by the NORMAN Association is used to review the state-of-the-art and discuss future perspectives of non-target screening using high-resolution mass spectrometry in water analysis.