Fréchet ChemNet Distance: A Metric for Generative Models for Molecules in Drug Discovery

@article{Preuer2018FrchetCD,
  title={Fr{\'e}chet ChemNet Distance: A Metric for Generative Models for Molecules in Drug Discovery},
  author={Kristina Preuer and Philipp Renz and Thomas Unterthiner and Sepp Hochreiter and G{\"u}nter Klambauer},
  journal={Journal of Chemical Information and Modeling},
  year={2018},
  volume={58},
  number={9},
  pages={1736--1741}
}
The new wave of successful generative models in machine learning has increased the interest in deep learning driven de novo drug design. However, method comparison is difficult because of various flaws of the currently employed evaluation metrics. We propose an evaluation metric for generative models called Fréchet ChemNet distance (FCD). The advantage of the FCD over previous metrics is that it can detect whether generated molecules are diverse and have similar chemical and biological… 
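
The FCD compares two sets of molecules through Gaussian statistics of their ChemNet activations, using a Fréchet distance between the two fitted Gaussians. The following is a minimal sketch of that distance, not the authors' implementation. It assumes diagonal covariances, a simplification under which the squared distance reduces to a per-dimension sum; the general form uses full covariance matrices and a matrix square root. The activation vectors are hypothetical stand-ins for ChemNet outputs, and the function names are illustrative.

```python
import math
from statistics import fmean, pvariance


def gaussian_stats(activations):
    """Per-dimension mean and population variance of equal-length vectors."""
    columns = list(zip(*activations))
    means = [fmean(col) for col in columns]
    variances = [pvariance(col) for col in columns]
    return means, variances


def frechet_distance_diag(stats_a, stats_b):
    """Squared Frechet distance between two Gaussians with diagonal covariance.

    Assumption: off-diagonal covariance terms are ignored, so the trace term
    Tr(C_a + C_b - 2 (C_a C_b)^(1/2)) becomes a sum over dimensions.
    """
    (mean_a, var_a), (mean_b, var_b) = stats_a, stats_b
    mean_term = sum((ma - mb) ** 2 for ma, mb in zip(mean_a, mean_b))
    cov_term = sum(
        va + vb - 2.0 * math.sqrt(va * vb) for va, vb in zip(var_a, var_b)
    )
    return mean_term + cov_term


# Hypothetical 2-dimensional "activations" for reference and generated sets.
reference = [[0.0, 1.0], [2.0, 3.0], [4.0, 5.0]]
generated = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]  # reference shifted by +1

distance = frechet_distance_diag(
    gaussian_stats(reference), gaussian_stats(generated)
)
print(f"squared Frechet distance (diagonal sketch): {distance:.4f}")
```

In this toy case the two sets have the same spread and differ only by a shift of 1 in each dimension, so the distance comes entirely from the mean term. Identical sets give a distance of zero.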

On failure modes in molecule generation and optimization.
On Failure Modes of Molecule Generators and Optimizers
TLDR
This work highlights some unintended failure modes of generative models and how these evade detection by current performance metrics.
Generate Novel Molecules With Target Properties Using Conditional Generative Models
TLDR
This paper presents a novel neural network for generating small molecules similar to the ones in the training set, which outperforms previous methods using Molecular weight, LogP and Quantitative Estimation of Drug-likeness as the evaluation metrics.
RELATION: A Deep Generative Model for Structure-Based De Novo Drug Design.
TLDR
The calculation results demonstrated that the RELATION model could efficiently generate novel molecules with favorable binding affinity and pharmacophore features and was used to design inhibitors for two targets, AKT1 and CDK2.
Multi-Objective Molecule Generation using Interpretable Substructures
TLDR
This work proposes to offset the complexity of the generative modeling of molecules by composing molecules from a vocabulary of substructures that are likely responsible for each property of interest, called molecular rationales.
Masked graph modeling for molecule generation
TLDR
A masked graph model is introduced, which learns a distribution over graphs by capturing conditional distributions over unobserved nodes and edges given observed ones, and which outperforms previously proposed graph-based approaches and is competitive with SMILES-based approaches.
Deep learning for molecular generation.
TLDR
Recent developments in deep learning models for molecular generation are discussed in terms of four generative architectures and four optimization strategies, along with future directions of deep generative models for de novo drug design.
Deep Learning methods for generative learning and applications in drug discovery
  • Computer Science, Biology
  • 2019
TLDR
The aim of this thesis is to explore those multiple leverage points for Deep Learning models in order to facilitate the multifaceted process of drug discovery, and to propose a method for deep neural networks designed to predict biological activities and provide the most indicative structures for the present task.
Generating Customized Compound Libraries for Drug Discovery with Machine Intelligence
TLDR
A deep learning framework for customized compound library generation is presented, aiming to enrich and expand the pharmacologically relevant chemical space with new molecular entities ‘on demand’.
Direct Steering of de novo Molecular Generation using Descriptor Conditional Recurrent Neural Networks (cRNNs)
TLDR
This work proposes a simple approach to the focused generative task by constructing a conditional recurrent neural network (cRNN) that is able to generate molecules near multiple specified conditions, while maintaining an output that is more focused than traditional RNNs yet less focused than autoencoders.

References

ChemGAN challenge for drug discovery: can AI reproduce natural chemical diversity?
Generating molecules with desired chemical properties is important for drug discovery. The use of generative neural networks is promising for this task. However, from visual inspection, it often
Can AI reproduce observed chemical diversity?
Generating diverse molecules with desired chemical properties is important for drug discovery. The use of generative neural networks is promising for this task. To facilitate evaluation of generative
Large-scale comparison of machine learning methods for drug target prediction on ChEMBL (DOI: 10.1039/c8sc00148k)
The largest comparative study to date of nine state-of-the-art drug target prediction methods finds that deep learning outperforms all other competitors. The results are based on a benchmark of 1300
Generating Focused Molecule Libraries for Drug Discovery with Recurrent Neural Networks
TLDR
This work shows that recurrent neural networks can be trained as generative models for molecular structures, similar to statistical language models in natural language processing, and demonstrates that the properties of the generated molecules correlate very well with those of the molecules used to train the model.
ChemTS: an efficient python library for de novo molecular generation
TLDR
A novel Python library ChemTS that explores the chemical space by combining Monte Carlo tree search and an RNN is presented, which showed superior efficiency in finding high-scoring molecules in a benchmarking problem of optimizing the octanol-water partition coefficient and synthesizability.
Deep reinforcement learning for de novo drug design
TLDR
The ReLeaSE method is used to design chemical libraries with a bias toward structural complexity or toward compounds with maximal, minimal, or specific range of physical properties, such as melting point or hydrophobicity.
Objective-Reinforced Generative Adversarial Networks (ORGAN) for Sequence Generation Models
TLDR
This work builds upon previous results that incorporated GANs and RL in order to generate sequence data and test this model in several settings for the generation of molecules encoded as text sequences and in the context of music generation, showing for each case that it can effectively bias the generation process towards desired metrics.
Automatic Chemical Design Using a Data-Driven Continuous Representation of Molecules
We report a method to convert discrete representations of molecules to and from a multidimensional continuous representation. This model allows us to generate new molecules for efficient exploration
Estimation of synthetic accessibility score of drug-like molecules based on molecular complexity and fragment contributions
TLDR
This method uses historical synthetic knowledge obtained by analyzing information from millions of already synthesized chemicals and considers also molecule complexity, which is sufficiently fast and provides results consistent with estimation of ease of synthesis by experienced medicinal chemists.
Learning Deep Generative Models of Graphs
TLDR
This work is the first and most general approach for learning generative models over arbitrary graphs, and opens new directions for moving away from restrictions of vector- and sequence-like knowledge representations, toward more expressive and flexible relational data structures.