JANUS: Parallel Tempered Genetic Algorithm Guided by Deep Neural Networks for Inverse Molecular Design

Authors: AkshatKumar Nigam, Robert Pollice, Alán Aspuru-Guzik
Inverse molecular design involves algorithms that sample molecules with specific target properties from a multitude of candidates and can be posed as an optimization problem. High-dimensional optimization tasks in the... 


A molecular generative model with genetic algorithm and tree search for cancer samples
FasterGTS is constructed with a genetic algorithm and a Monte Carlo tree search with three deep neural networks: supervised learning, self-trained, and value networks, and it generates anticancer molecules based on the genetic profiles of a cancer sample.
Improving De Novo Molecular Design with Curriculum Learning
This work implements CL in the de novo design platform, REINVENT, and applies it to illustrative de novo molecular design problems of different complexity, showing both accelerated learning and a positive impact on the quality of the output when compared to standard policy-based RL.
An In-depth Summary of Recent Artificial Intelligence Applications in Drug Design
This survey includes the theoretical development of the previously mentioned AI models and detailed summaries of 42 recent applications of AI in drug design, and provides a holistic discussion of the abundant applications so that the tasks, potential solutions, and challenges in AI-based drug design become evident.
Fragment-based Sequential Translation for Molecular Optimization
Fragment-based Sequential Translation (FaST), which learns a reinforcement learning (RL) policy to iteratively translate model-discovered molecules into increasingly novel molecules while satisfying desired properties, is proposed.
Model agnostic generation of counterfactual explanations for molecules
This work shows a universal model-agnostic approach that can explain any black-box model prediction and demonstrates this method on random forest models, sequence models, and graph neural networks in both classification and regression.
A graph neural network approach for molecule carcinogenicity prediction
This work proposes a model for carcinogenicity prediction, CONCERTO, which uses a graph transformer in conjunction with a molecular fingerprint representation, trained on multi-round mutagenicity and carcinogenicity objectives, and yields results superior to alternate approaches for molecular carcinogenicity prediction.
Uncertainty-aware Mixed-variable Machine Learning for Materials Design
This work surveys frequentist and Bayesian approaches to uncertainty quantification of machine learning with mixed variables, investigating the machine learning models’ predictive and uncertainty estimation capabilities, and provides interpretations of the observed performance differences.
Realistic mask generation for matter-wave lithography via machine learning
Fast production of large area patterns with nanometre resolution is crucial for the established semiconductor industry and for enabling industrial scale production of next generation quantum devices.
Autonomous Reaction Network Exploration in Homogeneous and Heterogeneous Catalysis
Autonomous computations that rely on automated reaction network elucidation algorithms may pave the way to make computational catalysis on a par with experimental research in the field. Several


Augmenting Genetic Algorithms with Deep Neural Networks for Exploring the Chemical Space
A genetic algorithm is presented that is enhanced with a neural network based discriminator model to improve the diversity of generated molecules and at the same time steer the GA, and it is shown that this algorithm outperforms other generative models in optimization tasks.
ChemTS: an efficient python library for de novo molecular generation
A novel Python library ChemTS that explores the chemical space by combining Monte Carlo tree search and an RNN is presented, which showed superior efficiency in finding high-scoring molecules in a benchmarking problem of optimizing the octanol-water partition coefficient and synthesizability.
A graph-based genetic algorithm and generative model/Monte Carlo tree search for the exploration of chemical space
This paper presents a comparison of a graph-based genetic algorithm (GB-GA) and machine learning (ML) results for the optimization of log P values with a constraint for synthetic accessibility and
GuacaMol: Benchmarking Models for De Novo Molecular Design
This work proposes an evaluation framework, GuacaMol, based on a suite of standardized benchmarks, to standardize the assessment of both classical and neural models for de novo molecular design, and describes a variety of single and multiobjective optimization tasks.
Beyond generative models: superfast traversal, optimization, novelty, exploration and discovery (STONED) algorithm for molecules using SELFIES
STONED is proposed – a simple and efficient algorithm to perform interpolation and exploration in the chemical space, comparable to deep generative models, bypassing the need for large amounts of data and training times by using string modifications in the SELFIES molecular representation.
Optimizing distributions over molecular space. An Objective-Reinforced Generative Adversarial Network for Inverse-design Chemistry (ORGANIC)
This methodology combines two successful techniques from the machine learning community: a Generative Adversarial Network (GAN), to create non-repetitive sensible molecular species, and Reinforcement Learning (RL), to bias this generative distribution towards certain attributes.
Learning To Navigate The Synthetically Accessible Chemical Space Using Reinforcement Learning
A novel forward synthesis framework powered by reinforcement learning (RL) for de novo drug design, Policy Gradient for Forward Synthesis (PGFS), that addresses this challenge by embedding the concept of synthetic accessibility directly into the de novo drug design system.
Goal directed molecule generation using Monte Carlo Tree Search
This work proposes a novel method, which they call unitMCTS, to perform molecule generation by making a unit change to the molecule at every step using Monte Carlo Tree Search, and shows that this method outperforms the recently published techniques on benchmark molecular optimization tasks such as QED and penalized logP.
Generating Focused Molecule Libraries for Drug Discovery with Recurrent Neural Networks
This work shows that recurrent neural networks can be trained as generative models for molecular structures, similar to statistical language models in natural language processing, and demonstrates that the properties of the generated molecules correlate very well with those of the molecules used to train the model.