Predicting Retrosynthetic Reaction using Self-Corrected Transformer Neural Networks

Shuangjia Zheng, Jiahua Rao, Zhongyue Zhang, Jun Xu, Yuedong Yang. Journal of Chemical Information and Modeling.
Synthesis planning is the process of recursively decomposing target molecules into available precursors. Computer-aided retrosynthesis can potentially assist chemists in designing synthetic routes, but at present it is cumbersome and cannot provide results of satisfactory quality. In this study, we have developed a template-free self-corrected retrosynthesis predictor (SCROP) to predict retrosynthesis by using Transformer neural networks. In the method, the retrosynthesis planning was… 
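Template-free Transformer models of this kind are usually trained on pairs of tokenized SMILES strings, with the product as the source sequence and the reactants as the target. The following is a minimal sketch of that preprocessing step only, not the SCROP implementation: the regular expression, the `tokenize` helper, and the acetanilide example pair are illustrative assumptions and do not come from the paper.

```python
import re

# Hypothetical tokenizer pattern (an assumption, not taken from the paper).
SMILES_TOKEN = re.compile(
    r"\[[^\]]+\]"             # bracket atoms such as [nH] or [N+]
    r"|Br|Cl"                 # two-letter atoms of the organic subset
    r"|%\d{2}"                # two-digit ring-closure labels
    r"|[A-Za-z]"              # single-letter atoms (aromatic atoms are lowercase)
    r"|\d"                    # single-digit ring-closure labels
    r"|[=#\-+\\/:~@?>*$().]"  # bonds, branches, and the '.' molecule separator
)


def tokenize(smiles):
    """Split a SMILES string into tokens; fail if any character is not covered."""
    tokens = SMILES_TOKEN.findall(smiles)
    if "".join(tokens) != smiles:
        raise ValueError(f"could not tokenize SMILES: {smiles!r}")
    return tokens


# Illustrative pair: acetanilide (product) and acetyl chloride + aniline (reactants).
product = "CC(=O)Nc1ccccc1"
reactants = "CC(=O)Cl.Nc1ccccc1"

source_tokens = tokenize(product)    # sequence the model would read
target_tokens = tokenize(reactants)  # sequence the model would be trained to emit

print(" ".join(source_tokens))
print(" ".join(target_tokens))
```

Treating `Cl` and `Br` as single tokens keeps the vocabulary aligned with chemical atoms rather than characters, and the round-trip check in `tokenize` guards against silently dropping unsupported symbols.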

Learning Discrete Neural Reaction Class to Improve Retrosynthesis Prediction
Computer-aided retrosynthesis accelerates and innovates the process of molecule and material design, allowing the discovery of new pathways and automating part of the overall development process for
Molecular Graph Enhanced Transformer for Retrosynthesis Prediction
A Graph Enhanced Transformer (GET) framework is proposed, which adopts both the sequential and graphical information of molecules and significantly outperforms the vanilla Transformer model in test accuracy.
State-of-the-art augmented NLP transformer models for direct and single-step retrosynthesis
It is shown that data augmentation, which is a powerful method used in image processing, eliminated the effect of data memorization by neural networks and improved their performance for prediction of new sequences.
Evaluation Metrics for Single-Step Retrosynthetic Models
It is shown that it is possible to train a transformer-based retrosynthetic model, reaching a round-trip accuracy of 82.4%, while covering 96.4% of the reactions.
Deep learning in retrosynthesis planning: datasets, models and tools
This review comprehensively summarizes the development process of retrosynthesis in the context of deep learning, including datasets, models and tools, discusses the disadvantages of the existing models, and provides potential future trends.
G2Retro: Two-Step Graph Generative Models for Retrosynthesis Prediction
Retrosynthesis is a procedure where a molecule is transformed into potential reactants and thus the synthesis routes are identified. We propose a novel generative framework, denoted as G2Retro, for
Prediction and Interpretable Visualization of Retrosynthetic Reactions Using Graph Convolutional Networks
This paper proposes an interpretable prediction framework using Graph Convolutional Networks (GCN) for retrosynthetic reaction prediction and Integrated Gradients (IGs) for visualization of contributions to the prediction to address these challenges.
Predicting retrosynthetic pathways using a combined linguistic model and hyper-graph exploration strategy
This work introduces new metrics (coverage, class diversity, round-trip accuracy and Jensen-Shannon divergence) to evaluate the single-step retrosynthetic models, using the forward prediction and a reaction classification model always based on the transformer architecture.
Unassisted Noise Reduction of Chemical Reaction Data Sets
A machine learning-based, unassisted approach to remove chemically wrong entries from chemical reaction collections is proposed, which is the first unassisted rule-free technique to address automatic noise reduction in chemical data sets.
Advances in De Novo Drug Design: From Conventional to Machine Learning Methods
This review article summarizes advances in de novo drug design, from conventional growth algorithms to advanced machine-learning methodologies, and highlights hot topics for further development.


Planning chemical syntheses with deep neural networks and symbolic AI
This work combines Monte Carlo tree search with an expansion policy network that guides the search, and a filter network to pre-select the most promising retrosynthetic steps that solve for almost twice as many molecules, thirty times faster than the traditional computer-aided search method.
Molecular Transformer for Chemical Reaction Prediction and Uncertainty Estimation
This work treats reaction prediction as a machine translation problem between SMILES strings of reactants-reagents and the products, and shows that a multi-head attention Molecular Transformer model outperforms all algorithms in the literature, achieving a top-1 accuracy above 90% on a common benchmark dataset.
Retrosynthetic Reaction Prediction Using Neural Sequence-to-Sequence Models
A fully data-driven model that learns to perform a retrosynthetic reaction prediction task, which is treated as a sequence-to-sequence mapping problem, and also overcomes certain limitations associated with rule-based expert systems and with any machine learning approach that contains a rule-based expert system component.
Neural-Symbolic Machine Learning for Retrosynthesis and Reaction Prediction.
It is reported that deep neural networks can learn to resolve reactivity conflicts and to prioritize the most suitable transformation rules.
Machine Learning in Computer-Aided Synthesis Planning.
Two critical aspects of CASP and recent machine learning approaches to both challenges are focused on, including the problem of retrosynthetic planning and anticipating the products of chemical reactions, which can be used to validate proposed reactions in a computer-generated synthesis plan.
Computer-Assisted Retrosynthesis Based on Molecular Similarity
We demonstrate molecular similarity to be a surprisingly effective metric for proposing and ranking one-step retrosynthetic disconnections based on analogy to precedent reactions. The developed
Linking the Neural Machine Translation and the Prediction of Organic Chemistry Reactions
A gated recurrent unit based sequence-to-sequence model and a parser to generate input tokens for the model from reaction SMILES strings were built to translate 'reactants and reagents' to 'products'.
Computer‐aided synthesis design: 40 years on
This review compares and contrasts the diverse approaches taken by selected programs in both the design and implementation of molecule feature perception and reaction rule representation, and it is argued that the progress achieved in this aspect paves the way to a deeper exploration of computer approaches to applying strategy and control in the synthesis problem.
Learning to Predict Chemical Reactions
This work describes single mechanistic reactions as interactions between coarse approximations of molecular orbitals (MOs), uses topological and physicochemical attributes as descriptors, and proposes a new approach to reaction prediction utilizing elements from each pole.
Identifying Structure-Property Relationships through SMILES Syntax Analysis with Self-Attention Mechanism
This work proposes a new method for identifying SAR/SPR through linear notation (for example, SMILES) syntax analysis with a self-attention mechanism, an interpretable deep learning architecture, and demonstrates that the method yields superior performance compared with state-of-the-art models.