Extraction of chemical structures and reactions from the literature

@inproceedings{Lowe2012ExtractionOC,
  title={Extraction of chemical structures and reactions from the literature},
  author={Daniel M. Lowe},
  year={2012}
}


ChemScanner: extraction and re-use(ability) of chemical information from common scientific documents containing ChemDraw files
TLDR
The ChemScanner project aims to support chemists in their efforts to re-use chemistry research data by providing them with the missing tools for automated assembly of reaction data.
SURVEY ON INFORMATION EXTRACTION FROM CHEMICAL COMPOUND LITERATURES : TECHNIQUES AND CHALLENGES
Chemical documents, especially those involving drug information, comprise a variety of types – the most common being journal articles, patents and theses. They typically contain large amounts of ...
Extraction of Reactions from Patents using Grammars
TLDR
LeadMine is used to recognize chemicals and physical quantities using a grammar and these entities are used with ChemicalTagger’s phrase grammar to determine the relationship between chemicals and reaction properties.
Big Data from Pharmaceutical Patents: A Computational Analysis of Medicinal Chemists' Bread and Butter.
TLDR
This study used a sophisticated text-mining pipeline to extract 1.15 million unique whole reaction schemes, including reaction roles and yields, from pharmaceutical patents, and found that today's typical product molecule is larger, more hydrophobic, and more rigid than 40 years ago.
A review of optical chemical structure recognition tools
TLDR
This review provides an overview of all methods and tools that have been published in the field of OCSR; some of the latest approaches are based on deep neural networks (DNNs).
Automatic identification of relevant chemical compounds from patents
TLDR
An automated system that extracts chemical entities from patents and classifies their relevance with high performance is designed, which enables the extension of the Reaxys database by means of automation.
The development of models to predict melting and pyrolysis point data associated with several hundred thousand compounds mined from PATENTS
TLDR
The development of a pipeline for the automated extraction and annotation of chemical data from published PATENTS showed that models developed using data collected from PATENTS had similar or better prediction accuracy compared to the highly curated data used in previous publications.
Computational Chemical Synthesis Analysis and Pathway Design
TLDR
This review aims to summarize the developments of computer-assisted synthetic analysis and design in recent years, and how machine-learning algorithms contributed to them.
BioNavi-NP: Biosynthesis Navigator for Natural Products
TLDR
BioNavi-NP is a navigable and user-friendly toolkit capable of predicting the biosynthetic pathways for NPs and NP-like compounds through a novel (AND-OR Tree)-based planning algorithm, an enhanced molecular Transformer neural network, and a training set that combines general organic transformations and biosynthetic steps.
The Future of Chemical Information Is Now
TLDR
It is believed that, in the short term, efforts are needed to expand awareness and training of skill-sets needed to get the greatest value out of the available data.

References

SHOWING 1-10 OF 103 REFERENCES
Automated Information Extraction and Structure-Activity Relationship Analysis of Cytochrome P450 Substrates
TLDR
Although the previous classification model was developed using data from only 161 compounds, the model classified the substrates found by text-mining analysis with reasonable accuracy and confirmed the validity of both the multi-objective classification model for CYP substrates and the text-mining procedure.
Revised Section F: Natural products and related compounds
The nomenclature of natural products has suffered from much confusion, mostly for historical reasons. The isolation of a new substance, in the early days of the science, generally preceded its ...
Nomenclature of carbohydrates (IUPAC Recommendations 1996)
  • A. McNaught
  • Chemistry, Medicine
  • Advances in carbohydrate chemistry and biochemistry
  • 1997
TLDR
Chairmen: H. Dixon (UK), J. F. Vliegenthart (Netherlands), A. Cornish-Bowden (France) and M. Venetianer (Hungary); Secretaries: A. Karlson (Germany), C. Liébecq (Belgium), K. Reedijk (Netherlands); Members: J. Tipton (Ireland), S. Velick (USA), P. Vejer (USA).
Extension and revision of the von Baeyer system for naming polycyclic compounds (including bicyclic compounds)
TLDR
These recommendations document the von Baeyer system for naming polycyclic ring systems described in Rules A-31, A-32 and B-14 and extend the system to cover more complex cases.
Automated Extraction of Information from the Literature on Chemical-CYP3A4 Interactions
TLDR
A text mining system that extracts information on chemical-CYP3A4 interactions using a simple but effective pattern matching method based on the order of three keywords will be applicable to interactions of chemicals with any functional proteins, such as enzymes and transporters, simply by changing the list of key verbs.
The importance of green chemistry in process research and development.
  • P. Dunn
  • Chemistry, Medicine
  • Chemical Society reviews
  • 2012
TLDR
A rosuvastatin intermediate is produced using a deoxy ribose aldolase (DERA) enzyme in which two carbon-carbon bonds and two chiral centres are formed in the same process step.
Extraction of CYP Chemical Interactions from Biomedical Literature Using Natural Language Processing Methods
TLDR
A system that automatically extracts CYP protein and chemical interactions from journal article abstracts using natural language processing (NLP) and text mining methods, with a maximum entropy based learning method.
Improving the quality of published chemical names with nomenclature software.
TLDR
Criteria for classification of systematic names in terms of quality/correctness are discussed and applied to a sample set of several hundred names extracted from the literature.
Identification of Chemical Entities in Patent Documents
TLDR
A chemical entity recognizer that uses a machine learning approach based on conditional random fields (CRF) is presented, and its performance is compared with dictionary-based approaches using several terminological resources.
ChEBI: a database and ontology for chemical entities of biological interest
TLDR
A dictionary of molecular entities focused on ‘small’ chemical compounds and an ontological classification, whereby the relationships between molecular entities or classes of entities and their parents and/or children are specified.