Automatic Chemical Design Using a Data-Driven Continuous Representation of Molecules

@article{GmezBombarelli2018AutomaticCD,
  title={Automatic Chemical Design Using a Data-Driven Continuous Representation of Molecules},
  author={Rafael G{\'o}mez-Bombarelli and David Kristjanson Duvenaud and Jos{\'e} Miguel Hern{\'a}ndez-Lobato and Jorge Aguilera-Iparraguirre and Timothy D. Hirzel and Ryan P. Adams and Al{\'a}n Aspuru-Guzik},
  journal={ACS Central Science},
  year={2018},
  volume={4},
  pages={268--276}
}
We report a method to convert discrete representations of molecules to and from a multidimensional continuous representation. This model allows us to generate new molecules for efficient exploration and optimization through open-ended spaces of chemical compounds. A deep neural network was trained on hundreds of thousands of existing chemical structures to construct three coupled functions: an encoder, a decoder, and a predictor. The encoder converts the discrete representation of a molecule…
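The three coupled functions named in the abstract can be pictured with a small, untrained toy. This is a sketch under stated assumptions, not the paper's model: the SMILES-like character vocabulary, the fixed string length, the latent size, and the random linear maps standing in for the deep neural network are all hypothetical choices made for illustration, and the predictor is assumed to map the continuous vector to a single scalar property.

```python
import random

# Hypothetical toy vocabulary of SMILES characters (an assumption for illustration).
VOCAB = ["C", "N", "O", "=", "(", ")", "1", " "]
MAX_LEN = 8      # fixed string length; shorter strings are padded with spaces
LATENT_DIM = 4   # size of the continuous representation
INPUT_DIM = MAX_LEN * len(VOCAB)


def one_hot(smiles):
    """Flatten a space-padded string into a 0/1 vector, one block per character."""
    vec = []
    for ch in smiles.ljust(MAX_LEN)[:MAX_LEN]:
        vec.extend(1.0 if ch == v else 0.0 for v in VOCAB)
    return vec


def random_matrix(rows, cols, rng):
    """Random weights; a trained network would learn these instead."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    """Plain matrix-vector product."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


rng = random.Random(0)
W_enc = random_matrix(LATENT_DIM, INPUT_DIM, rng)   # encoder weights (untrained)
W_dec = random_matrix(INPUT_DIM, LATENT_DIM, rng)   # decoder weights (untrained)
W_pred = random_matrix(1, LATENT_DIM, rng)          # predictor weights (untrained)


def encode(smiles):
    """Discrete string -> continuous vector."""
    return matvec(W_enc, one_hot(smiles))


def decode(z):
    """Continuous vector -> discrete string (highest-scoring character per position)."""
    scores = matvec(W_dec, z)
    chars = []
    for i in range(MAX_LEN):
        block = scores[i * len(VOCAB):(i + 1) * len(VOCAB)]
        chars.append(VOCAB[block.index(max(block))])
    return "".join(chars).rstrip()


def predict(z):
    """Continuous vector -> scalar property estimate (assumed role of the predictor)."""
    return matvec(W_pred, z)[0]


z = encode("CC(=O)N")
print(len(z), repr(decode(z)), predict(z))
```

Because the weights are random, the decoded string and the predicted value are meaningless here; the sketch only shows how the three functions share one continuous vector, which is what makes optimization in that space possible once the maps are trained.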
Learning Continuous and Data-Driven Molecular Descriptors by Translating Equivalent Chemical Representations
There has been a recent surge of interest in using machine learning across chemical space in order to predict properties of molecules or design molecules and materials with desired properties. Most…
3DMolNet: A Generative Network for Molecular Structures
TLDR
This work proposes a new approach to efficiently generate molecular structures that are not restricted to a fixed size or composition, based on the variational autoencoder which learns a translation-, rotation-, and permutation-invariant low-dimensional representation of molecules.
Learning a Continuous Representation of 3D Molecular Structures with Deep Generative Models
TLDR
Deep generative models for three dimensional molecular structures using atomic density grids and a novel fitting algorithm that converts continuous grids to discrete molecular structures are described.
Deep Molecular Dreaming: Inverse machine learning for de-novo molecular design and interpretability with surjective representations
TLDR
PASITHEA is proposed: a direct gradient-based molecule optimization method that applies inceptionism techniques from computer vision and forms an inverse regression model capable of generating molecular variants optimized for a certain property.
Data-Driven Approach to Encoding and Decoding 3-D Crystal Structures
TLDR
A method to encode and decode the position of atoms in 3-D molecules from a dataset of nearly 50,000 stable crystal unit cells that vary from containing 1 to over 100 atoms is presented.
Representation of molecular structures with persistent homology for machine learning applications in chemistry
TLDR
A molecular representation derived from persistent homology is demonstrated through an active-learning approach for predicting CO2/N2 interaction energies at the density functional theory (DFT) level.
ChemoVerse: Manifold traversal of latent spaces for novel molecule discovery
TLDR
This work presents a manifold traversal with heuristic search to explore the latent chemical space using various generative models such as grammar variational autoencoders as they deal with the randomized generation and validity of compounds.
Optimization of Molecular Characteristics via Machine Learning Based on Continuous Representation of Molecules
We demonstrate an automatic materials design method using a continuous representation of a molecule and its atomic arrangement via a neural network algorithm. This method is applied to optimizing and…
Actively Searching: Inverse Design of Novel Molecules with Simultaneously Optimized Properties
TLDR
This work demonstrates an active learning approach to improve the performance of multi-target generative chemical models by utilizing their inherent generative and predictive aspects for self-refinement in situations where any number of properties with varying degrees of correlation must be optimized simultaneously.
Molecular Optimization by Capturing Chemist's Intuition Using Deep Neural Networks
TLDR
This work seeks to capture the chemist’s intuition from matched molecular pairs using machine translation models and shows that the Transformer can generate more molecules with desirable properties by making small modifications to the given starting molecules, which can be intuitive to chemists.

References

ChemTS: an efficient python library for de novo molecular generation
TLDR
A novel Python library ChemTS that explores the chemical space by combining Monte Carlo tree search and an RNN is presented, which showed superior efficiency in finding high-scoring molecules in a benchmarking problem of optimizing the octanol-water partition coefficient and synthesizability.
Generating Focused Molecule Libraries for Drug Discovery with Recurrent Neural Networks
TLDR
This work shows that recurrent neural networks can be trained as generative models for molecular structures, similar to statistical language models in natural language processing, and demonstrates that the properties of the generated molecules correlate very well with those of the molecules used to train the model.
Application of Generative Autoencoder in De Novo Molecular Design
TLDR
The results show that the latent space preserves the chemical similarity principle and thus can be used to generate analogue structures with an autoencoder for de novo molecular design.
Machine Learning Predictions of Molecular Properties: Accurate Many-Body Potentials and Nonlocality in Chemical Space
TLDR
A systematic hierarchy of efficient empirical methods to estimate atomization and total energies of molecules is achieved by a vectorized representation of molecules (the so-called Bag of Bonds model) that exhibits strong nonlocality in chemical space.
Quantum-chemical insights from deep tensor neural networks
TLDR
An efficient deep learning approach is developed that enables spatially and chemically resolved insights into quantum-mechanical observables of molecular systems, and unifies concepts from many-body Hamiltonians with purpose-designed deep tensor neural networks, which leads to size-extensive and uniformly accurate chemical space predictions.
Chemical space as a source for new drugs
The chemical space is the ensemble of all possible molecules, which is believed to contain at least 10^60 organic molecules below 500 Da of possible interest for drug discovery. This review summarizes…
Stochastic voyages into uncharted chemical space produce a representative library of all possible drug-like compounds.
TLDR
The construction of a "representative universal library" spanning the small molecule universe (SMU) that samples the full extent of feasible small molecule chemistries is described, generated using the newly developed Algorithm for Chemical Space Exploration with Stochastic Search (ACSESS).
Designing molecules by optimizing potentials.
TLDR
It is shown that the optimal structures can be determined without enumerating and separately evaluating the characteristics of the combinatorial number of possible structures, a process that would be much slower.
Chemical Space Travel
TLDR
A “spaceship” program is reported which travels from a starting molecule A to a target molecule B through a continuum of structural mutations, and thereby charts unexplored chemical space.