“Found in Translation”: predicting outcomes of complex organic chemistry reactions using neural sequence-to-sequence models

Electronic supplementary information (ESI) available: Time-split test set and example predictions, together with attention weights, confidence and token probabilities. See DO

@article{Schwaller2017FoundIT,
  title={``Found in Translation'': predicting outcomes of complex organic chemistry reactions using neural sequence-to-sequence models},
  author={Philippe Schwaller and Th{\'e}ophile Gaudin and David Lanyi and Constantine Bekas and Teodoro Laino},
  journal={Chemical Science},
  year={2017}
}
There is an intuitive analogy between an organic chemist's understanding of a compound and a language speaker's understanding of a word. Based on this analogy, it is possible to introduce the basic concepts of linguistic analysis to the world of organic chemistry and to analyze their potential impact. In this work, we cast the reaction prediction task as a translation problem by introducing a template-free sequence-to-sequence model, trained end-to-end and fully data-driven. We propose a tokenization, which…
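To make the translation framing concrete, here is a minimal Python sketch of how reaction SMILES strings might be split into tokens so that reactants form the "source sentence" and the product forms the "target sentence". The abstract is truncated before it describes the tokenization the authors actually propose, so the regular expression, the `tokenize` helper, and the example reaction below are illustrative assumptions, not the paper's method.

```python
import re
from typing import List

# Hypothetical atom-wise SMILES pattern (an assumption for illustration only;
# it is not the tokenization proposed in the paper).
SMILES_TOKEN = re.compile(
    r"(\[[^\]]+\]|Br|Cl|[BCNOSPFI]|[bcnosp]|\(|\)|\.|=|#|-|\+|\\|/|:|~|@|\*|>|%[0-9]{2}|[0-9])"
)


def tokenize(smiles: str) -> List[str]:
    """Split a SMILES string into tokens, rejecting uncovered characters."""
    tokens = SMILES_TOKEN.findall(smiles)
    # findall silently skips characters the pattern does not match,
    # so verify that the tokens reassemble into the original string.
    if "".join(tokens) != smiles:
        raise ValueError(f"untokenizable SMILES: {smiles!r}")
    return tokens


# Example reaction chosen for illustration: acetic acid + ethanol -> ethyl acetate.
source_tokens = tokenize("CC(=O)O.OCC")  # reactants, separated by "."
target_tokens = tokenize("CC(=O)OCC")    # product

print(" ".join(source_tokens))
print(" ".join(target_tokens))
```

The space-separated token strings are the kind of input a generic sequence-to-sequence toolkit would typically consume; how the original model treats reagents, brackets, or ring closures may differ from this sketch.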

Key Quantitative Results

  • Using an attention-based model borrowed from human language translation, we improve on the state of the art in reaction prediction, achieving a top-1 accuracy of 80.3% without relying on auxiliary knowledge such as reaction templates or explicit atomic features.
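As a rough illustration of what a top-1 accuracy figure measures, the sketch below counts a prediction as correct only when the model's single highest-ranked product string equals the reference product. The `top1_accuracy` function and the toy data are hypothetical; the sketch also assumes both sides are already written in the same canonical SMILES form, and the exact matching procedure used in the paper is not described in this excerpt.

```python
from typing import Sequence


def top1_accuracy(predicted: Sequence[str], reference: Sequence[str]) -> float:
    """Fraction of reactions whose top-ranked prediction matches the reference.

    Assumes exact string comparison of already-canonicalized SMILES
    (an assumption of this sketch, not a statement about the paper).
    """
    if len(predicted) != len(reference):
        raise ValueError("predicted and reference must have the same length")
    if not reference:
        raise ValueError("cannot compute accuracy on an empty test set")
    hits = sum(p == r for p, r in zip(predicted, reference))
    return hits / len(reference)


# Toy data for illustration only (not results from the paper).
predictions = ["CC(=O)OCC", "CCO", "c1ccccc1"]
references = ["CC(=O)OCC", "CCN", "c1ccccc1"]
print(f"top-1 accuracy: {top1_accuracy(predictions, references):.3f}")
```

Exact string matching is strict: two different SMILES spellings of the same molecule would count as a miss unless both are canonicalized beforehand.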

Citations

Publications citing this paper:

  • Molecular Transformer for Chemical Reaction Prediction and Uncertainty Estimation
  • Graph Transformation Policy Network for Chemical Reaction Prediction
