Many InChIs and quite some feat

@article{Warr2015ManyIA,
  title={Many {InChIs} and quite some feat},
  author={Wendy A. Warr},
  journal={Journal of Computer-Aided Molecular Design},
  year={2015},
  volume={29},
  pages={681--694}
}
  • W. Warr
  • Published 17 June 2015
  • Computer Science
  • Journal of Computer-Aided Molecular Design
Fifteen years have passed since the International Union of Pure and Applied Chemistry (IUPAC) international chemical identifier (InChI) project [1–8] was initiated in 2000. The increasing complexity of molecular structures was making conventional naming procedures inconvenient, and there was no suitable, openly available electronic format for linking chemical structures over the Internet. So, InChI was developed as a freely available, non-proprietary identifier for chemical substances that can… 

InChI version 1.06: now more than 99.99% reliable

The software for the IUPAC Chemical Identifier, InChI, is extraordinarily reliable. This paper reports the improvements in the v1.06 release, which introduces significant new features, including support for pseudo-element atoms and an improved description of polymers.

A possible extension to the RInChI as a means of providing machine readable process data

This paper presents a possible extension to the IUPAC RInChI standard via an auxiliary layer, termed ProcAuxInfo. It is a standardised, extensible form for reporting key reaction parameters, such as the declaration of all products, reactants, and auxiliaries known in the reaction, reaction stoichiometry, amounts of substances used, conversion, yield, and operating conditions.

Information Retrieval and Text Mining Technologies for Chemistry.

This Review provides a comprehensive and in-depth description of fundamental concepts, technical implementations, and current technologies for meeting information demands of chemical information contained in scientific literature, patents, technical reports, or the web.

Molecular representations in AI-driven drug discovery: a review and practical guide

This review presents some of the most popular electronic molecular and macromolecular representations used in drug discovery, many of which are based on graph representations, and describes applications of these representations in AI-driven drug discovery.

Annotation of Peptide Structures Using SMILES and Other Chemical Codes–Practical Solutions

This publication discusses the generation of SMILES representations of peptides using existing software and offers brief recommendations for training staff working on peptide annotations.

Translating the InChI: adapting neural machine translation to predict IUPAC names from a chemical identifier

The model performed particularly well on organics, with the exception of macrocycles, and was comparable to commercial IUPAC name generation software, while the predictions were less accurate for inorganic and organometallic compounds.

Internet Databases of the Properties, Enzymatic Reactions, and Metabolism of Small Molecules—Search Options and Applications in Food Science

Exemplary databases reviewed in this article belong to two classes: tools concerning small molecules (including general and specialized databases annotating food components) and tools annotating enzymes and metabolism. Some problems associated with database application are also discussed.

Prediction-driven matched molecular pairs to interpret QSARs and aid the molecular optimization process

The study shows that such an approach, referred to as prediction-driven MMP analysis, is a useful tool for medicinal chemists, allowing identification of large numbers of “interesting” transformations that can be used to drive the molecular optimization process.

QUANTIS: Data quality assessment tool by clustering analysis

The demonstration of QUANTIS on three different pyrolysis cases showed that it can help in identifying and overcoming instabilities in experimental datasets, reduce mass and molar balance closure discrepancies, and, by evaluating the visualized correlation matrices, increase understanding in the underlying reaction pathways.

References


International chemical identifier for chemical reactions

An open-access software for creating a unique, text-based identifier for reactions (RInChI) was developed by the Goodman group at the University of Cambridge, based on the IUPAC International

UniChem: extension of InChI-based compound mapping to salt, connectivity and stereochemistry layers

This work describes how the layered structural representation of the Standard InChI is exploited to create new functionality within UniChem that integrates these related molecular forms.

yaInChI: Modified InChI string scheme for line notation of chemical structures

The modifications to yaInChI provide non-rotatable single bonds, stereochemistry of organometallic compounds, allene and cumulene, and parity of atoms with a lone pair, making it a promising solution for handling large chemical structure databases.

UniChem: a unified chemical structure cross-referencing and identifier tracking system

The recent inclusion of data sources external to EMBL-EBI has given users an even wider selection of resources to link to, all at no extra cost, while also providing a simple mechanism for external resources to link through UniChem to all EMBL-EBI chemistry resources.

Data Formats for Elementary Gas-Phase Kinetics: Part 2. Unique Representations of Reactions

A method of extending the IUPAC International Chemical Identifier (InChI) to describe and identify elementary reactions in a standard computer-readable notation is developed. Denoted InChI-ER, the

Indexing molecules with chemical graph identifiers

A modified and further extended version of the molecular equivalence number naming adaptation of the Morgan algorithm for the generation of a chemical graph identifier (CGI) that corrects for the collisions recognized in the original adaptation and includes the ability to deal with graph canonicalization, ensembles (salts), and isomerism in a flexible manner.

Automated systematic nomenclature generation for organic compounds

The capabilities of existing systematic naming software algorithms and tools are reviewed, along with some of their limitations and the challenges for future development.

Current Status and Future Development in Relation to IUPAC Activities

The IUPAC International Chemical Identifier (InChI) is a non-proprietary, machine-readable chemical structure representation format enabling electronic searching, and interlinking and combining, of

Data Formats for Elementary Gas Phase Kinetics, Part 1: Unique Representations of Species at the Molecular Level

Standardized electronic formats for data are needed to efficiently and transparently communicate the results of scientific studies. A format for the unique identification of chemical species is a

InChIKey collision resistance: an experimental testing

InChIKey is a 27-character compacted (hashed) version of InChI which is intended for Internet and database searching/indexing and is based on an SHA-256 hash of the InChI character string. The first
...
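
The InChIKey entry above describes a 27-character hashed form of InChI. As a minimal sketch of what that length implies, the Python below checks only the outward shape of a standard InChIKey: a 14-letter block, a 10-letter block, and a single final letter, joined by hyphens. It does not compute or verify the hash itself. The function names `looks_like_inchikey` and `split_inchikey` are hypothetical helpers introduced for illustration, not part of any InChI software, and the example key is the widely published InChIKey for ethanol.

```python
import re
from typing import Tuple

# Outward shape of a standard InChIKey: 14 uppercase letters, a hyphen,
# 10 uppercase letters, a hyphen, and 1 uppercase letter (27 characters).
INCHIKEY_PATTERN = re.compile(r"[A-Z]{14}-[A-Z]{10}-[A-Z]")


def looks_like_inchikey(candidate: str) -> bool:
    """Return True if the string has the shape of an InChIKey.

    This is a format check only; it cannot tell whether the key was
    actually derived from a real InChI string.
    """
    return INCHIKEY_PATTERN.fullmatch(candidate) is not None


def split_inchikey(key: str) -> Tuple[str, str, str]:
    """Split a well-formed InChIKey into its three hyphen-separated blocks."""
    if not looks_like_inchikey(key):
        raise ValueError(f"not a well-formed InChIKey: {key!r}")
    first_block, second_block, final_char = key.split("-")
    return first_block, second_block, final_char


# Published InChIKey for ethanol, used here purely as sample input.
ETHANOL_KEY = "LFQSCWFLJHTTHZ-UHFFFAOYSA-N"

if __name__ == "__main__":
    print(len(ETHANOL_KEY))
    print(looks_like_inchikey(ETHANOL_KEY))
    print(split_inchikey(ETHANOL_KEY))
```

A shape check like this is useful as a cheap filter before a database lookup, but because the key is a hash, a well-formed string is no guarantee that a corresponding structure exists.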