Corpus ID: 165163661

A COLD Approach to Generating Optimal Samples

@article{Mahmood2019ACA,
  title={A COLD Approach to Generating Optimal Samples},
  author={Omar Mahmood and Jos{\'e} Miguel Hern{\'a}ndez-Lobato},
  journal={ArXiv},
  year={2019},
  volume={abs/1905.09885}
}
Optimising discrete data for a desired characteristic using gradient-based methods involves projecting the data into a continuous latent space and carrying out optimisation in this space. Carrying out global optimisation is difficult as optimisers are likely to follow gradients into regions of the latent space that the model has not been exposed to during training; samples generated from these regions are likely to be too dissimilar to the training data to be useful. We propose Constrained… 

Sample-Efficient Optimization in the Latent Space of Deep Generative Models via Weighted Retraining

TLDR
An improved method for efficient black-box optimization is introduced; it performs the optimization in the low-dimensional, continuous latent manifold learned by a deep generative model and can be easily implemented on top of existing methods.

Mixed-Variable Bayesian Optimization

TLDR
MiVaBo is introduced, a novel BO algorithm for the efficient optimization of mixed-variable functions combining a linear surrogate model based on expressive feature representations with Thompson sampling, making MiVaBo the first BO method that can handle complex constraints over the discrete variables.

Masked graph modeling for molecule generation

TLDR
A masked graph model is introduced, which learns a distribution over graphs by capturing conditional distributions over unobserved nodes and edges given observed ones, and which outperforms previously proposed graph-based approaches and is competitive with SMILES-based approaches.

Constrained Bayesian Optimization for Automatic Chemical Design

TLDR
It is posited that constrained Bayesian optimization is a good approach for solving this class of training set mismatch in many generative tasks involving Bayesian optimization over the latent space of a variational autoencoder.

The Photoswitch Dataset: A Molecular Machine Learning Benchmark for the Advancement of Synthetic Chemistry

TLDR
The Photoswitch Dataset is introduced, a new benchmark for molecular machine learning where improvements in model performance can be immediately observed in the throughput of promising molecules synthesized in the lab.

Constrained Bayesian optimization for automatic chemical design using variational autoencoders (DOI: 10.1039/c9sc04026a)

TLDR
Automatic Chemical Design is a framework for generating novel molecules with optimized properties that can be applied to solve the challenge of designing complex molecules with novel properties.

Achieving robustness to aleatoric uncertainty with heteroscedastic Bayesian optimisation

TLDR
This paper proposes a heteroscedastic Bayesian optimisation scheme capable of representing and minimising aleatoric noise across the input space and introduces the aleatoric noise-penalised expected improvement (ANPEI) heuristic.

Bayesian Variational Autoencoders for Unsupervised Out-of-Distribution Detection

TLDR
This work proposes a new probabilistic, unsupervised approach to this problem based on a Bayesian variational autoencoder model, which estimates a full posterior distribution over the decoder parameters using stochastic gradient Markov chain Monte Carlo, instead of fitting a point estimate.

References

Automatic Chemical Design Using a Data-Driven Continuous Representation of Molecules

We report a method to convert discrete representations of molecules to and from a multidimensional continuous representation. This model allows us to generate new molecules for efficient exploration

Auto-Encoding Variational Bayes

TLDR
A stochastic variational inference and learning algorithm that scales to large datasets and, under some mild differentiability conditions, even works in the intractable case is introduced.

Multi-objective de novo drug design with conditional graph generative model

TLDR
A new de novo molecular design framework is proposed based on a type of sequential graph generators that do not use atom level recurrent units, which is much more tuned for molecule generation and has been scaled up to cover significantly larger molecules in the ChEMBL database.

Constrained Bayesian Optimization for Automatic Chemical Design

TLDR
It is posited that constrained Bayesian optimization is a good approach for solving this class of training set mismatch in many generative tasks involving Bayesian optimization over the latent space of a variational autoencoder.

A Direct Search Optimization Method That Models the Objective and Constraint Functions by Linear Interpolation

TLDR
An iterative algorithm for nonlinearly constrained optimization calculations when there are no derivatives, where a new vector of variables is calculated, which may replace one of the current vertices, either to improve the shape of the simplex or because it is the best vector that has been found so far.

Gradient-based learning applied to document recognition

TLDR
This paper reviews various methods applied to handwritten character recognition and compares them on a standard handwritten digit recognition task; convolutional neural networks are shown to outperform all other techniques.

Quantifying the chemical beauty of drugs.

TLDR
The utility of QED is extended by applying it to the problem of molecular target druggability assessment by prioritizing a large set of published bioactive compounds and may also capture the abstract notion of aesthetics in medicinal chemistry.

The ChEMBL database in 2017

TLDR
ChEMBL is an open large-scale bioactivity database that includes the annotation of assays and targets using ontologies, the inclusion of targets and indications for clinical candidates, addition of metabolic pathways for drugs and calculation of structural alerts.