Corpus ID: 219636373

A benchmark study on reliable molecular supervised learning via Bayesian learning

@article{Hwang2020ABS,
  title={A benchmark study on reliable molecular supervised learning via Bayesian learning},
  author={Doyeong Hwang and Grace Lee and Hanseok Jo and Seyoul Yoon and Seongok Ryu},
  journal={ArXiv},
  year={2020},
  volume={abs/2006.07021}
}
Virtual screening aims to find desirable compounds in a chemical library using computational methods. When machine learning is used for this purpose, model outputs that can be interpreted as predictive probabilities are beneficial, in that a high prediction score should correspond to a high probability of correctness. In this work, we present a study on the prediction performance and reliability of graph neural networks trained with the recently proposed Bayesian learning algorithms. Our work shows that…
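The reliability the abstract refers to is commonly quantified with calibration metrics such as expected calibration error (ECE). The snippet below is a minimal illustrative sketch of ECE, assuming numpy arrays of per-example confidences and 0/1 correctness indicators; it is not taken from the paper itself.

```python
import numpy as np

def expected_calibration_error(confidence, correct, n_bins=10):
    """ECE: bin predictions by confidence, then average the gap between
    empirical accuracy and mean confidence per bin, weighted by bin size.
    confidence: (N,) max predicted probabilities; correct: (N,) 0/1 array."""
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        in_bin = (confidence > lo) & (confidence <= hi)
        if in_bin.any():
            gap = abs(correct[in_bin].mean() - confidence[in_bin].mean())
            ece += in_bin.mean() * gap
    return ece
```

A well-calibrated model has a small ECE: among predictions made with, say, 0.9 confidence, roughly 90% are correct.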

Citations

Bayesian Graph Neural Networks for Molecular Property Prediction

TLDR
This study benchmarks a set of Bayesian methods applied to a directed MPNN, using the QM9 regression dataset, and finds that capturing uncertainty in both readout and message passing parameters yields enhanced predictive accuracy, calibration, and performance on a downstream molecular search task.

Reliable Graph Neural Networks for Drug Discovery Under Distributional Shift

TLDR
This work introduces CardioTox, a real-world benchmark on drug cardiotoxicity to facilitate reliability research on Graph Neural Networks, and demonstrates GNN-SNGP’s effectiveness in increasing distance-awareness, reducing overconfident mispredictions and making better calibrated predictions without sacrificing accuracy performance.

Gaussian Process Molecule Property Prediction with FlowMO

TLDR
FlowMO is an open-source Python library for molecular property prediction with Gaussian Processes, built on GPflow and RDKit, which demonstrates predictive performance comparable to deep learning methods alongside superior uncertainty calibration.
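FlowMO's own API is not reproduced here, but the ingredients named in the summary (GPflow for the Gaussian Process, RDKit for molecular featurization) can be combined as in the following sketch; the SMILES strings, fingerprint size, and kernel choice are illustrative assumptions.

```python
import numpy as np
import gpflow
from rdkit import Chem, DataStructs
from rdkit.Chem import AllChem

def featurize(smiles, n_bits=1024):
    # Morgan (circular) fingerprint as a simple fixed-length representation
    arr = np.zeros(n_bits)
    fp = AllChem.GetMorganFingerprintAsBitVect(Chem.MolFromSmiles(smiles), 2, nBits=n_bits)
    DataStructs.ConvertToNumpyArray(fp, arr)
    return arr

X = np.stack([featurize(s) for s in ["CCO", "c1ccccc1", "CC(=O)O"]])  # toy molecules
Y = np.array([[0.1], [0.5], [0.3]])                                    # toy property values

# exact GP regression; predict_y returns both a mean and a predictive variance
model = gpflow.models.GPR(data=(X, Y), kernel=gpflow.kernels.SquaredExponential())
gpflow.optimizers.Scipy().minimize(model.training_loss, model.trainable_variables)
mean, var = model.predict_y(X)
```

The per-point predictive variance `var` is what gives GPs the calibrated uncertainty the summary refers to.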

References

Showing 1-10 of 27 references

MoleculeNet: A Benchmark for Molecular Machine Learning

TLDR
MoleculeNet benchmarks demonstrate that learnable representations are powerful tools for molecular machine learning and broadly offer the best performance; however, this result comes with caveats.
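For orientation, MoleculeNet datasets are commonly accessed through the DeepChem library; a minimal loading sketch, where the Tox21 task and GraphConv featurizer are arbitrary example choices:

```python
import deepchem as dc

# load one MoleculeNet task; returns task names, split datasets, and transformers
tasks, (train, valid, test), transformers = dc.molnet.load_tox21(featurizer="GraphConv")
print(tasks)     # the 12 Tox21 toxicity targets
print(len(train))  # featurized molecules ready for a graph model
```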

Simple and Scalable Predictive Uncertainty Estimation using Deep Ensembles

TLDR
This work proposes an alternative to Bayesian NNs that is simple to implement, readily parallelizable, requires very little hyperparameter tuning, and yields high quality predictive uncertainty estimates.
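The recipe is simple enough to state in a few lines: train M identical networks from independent random initializations and average their predicted probabilities. A minimal PyTorch-style sketch, in which the architecture, `train_loader`, and ensemble size are placeholder assumptions:

```python
import torch
import torch.nn as nn

def make_model():
    # placeholder architecture; each member gets its own random initialization
    return nn.Sequential(nn.Linear(16, 64), nn.ReLU(), nn.Linear(64, 2))

def train(model, loader, epochs=10):
    opt = torch.optim.Adam(model.parameters())
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(epochs):
        for x, y in loader:
            opt.zero_grad()
            loss_fn(model(x), y).backward()
            opt.step()
    return model

ensemble = [train(make_model(), train_loader) for _ in range(5)]  # M = 5

@torch.no_grad()
def predict_proba(x):
    # the ensemble's predictive distribution is the mean of member softmaxes
    return torch.stack([torch.softmax(m(x), -1) for m in ensemble]).mean(0)
```

Members can be trained entirely independently, which is the "readily parallelizable" property the summary highlights.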

Bayesian Deep Learning and a Probabilistic Perspective of Generalization

TLDR
It is shown that deep ensembles provide an effective mechanism for approximate Bayesian marginalization, and a related approach is proposed that further improves the predictive distribution by marginalizing within basins of attraction, without significant overhead.
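The marginalization in question is the Bayesian model average over network weights w; an ensemble approximates the integral with a finite set of weight samples:

```latex
p(y \mid x, \mathcal{D})
  = \int p(y \mid x, w)\, p(w \mid \mathcal{D})\, dw
  \approx \frac{1}{M} \sum_{m=1}^{M} p(y \mid x, w_m),
  \qquad w_m \sim p(w \mid \mathcal{D}).
```

Independently trained ensemble members tend to land in different posterior modes, so even this crude M-sample average captures much of the integral; marginalizing within each basin of attraction, as the summary describes, refines it further.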

On Calibration of Modern Neural Networks

TLDR
It is discovered that modern neural networks, unlike those from a decade ago, are poorly calibrated, and on most datasets, temperature scaling -- a single-parameter variant of Platt Scaling -- is surprisingly effective at calibrating predictions.
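Temperature scaling divides the logits by a single scalar T learned on a held-out validation set, leaving the argmax (and hence accuracy) unchanged. A minimal sketch, assuming precomputed validation logits and labels:

```python
import torch

def fit_temperature(val_logits, val_labels):
    """Fit scalar T by minimizing validation NLL; softmax(z / T) is then
    used at test time. Optimizing log T keeps T positive."""
    log_t = torch.zeros(1, requires_grad=True)
    opt = torch.optim.LBFGS([log_t], max_iter=50)
    nll = torch.nn.CrossEntropyLoss()

    def closure():
        opt.zero_grad()
        loss = nll(val_logits / log_t.exp(), val_labels)
        loss.backward()
        return loss

    opt.step(closure)
    return log_t.exp().item()
```

A fitted T > 1 softens overconfident predictions toward the uniform distribution without changing which class is predicted.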

Dropout as a Bayesian Approximation: Representing Model Uncertainty in Deep Learning

TLDR
A new theoretical framework is developed casting dropout training in deep neural networks (NNs) as approximate Bayesian inference in deep Gaussian processes, which mitigates the problem of representing uncertainty in deep learning without sacrificing either computational complexity or test accuracy.
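In practice the approximation amounts to leaving dropout switched on at test time and averaging several stochastic forward passes; a sketch for a classifier, where the sample count is an arbitrary choice:

```python
import torch
import torch.nn as nn

@torch.no_grad()
def mc_dropout_predict(model, x, n_samples=30):
    """Monte-Carlo dropout: average stochastic forward passes."""
    model.eval()
    # re-enable only the dropout layers so e.g. batch norm stays in eval mode
    for m in model.modules():
        if isinstance(m, nn.Dropout):
            m.train()
    probs = torch.stack([torch.softmax(model(x), -1) for _ in range(n_samples)])
    return probs.mean(0), probs.var(0)  # predictive mean and a dispersion signal
```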

How Good is the Bayes Posterior in Deep Neural Networks Really?

TLDR
This work demonstrates through careful MCMC sampling that the posterior predictive induced by the Bayes posterior yields systematically worse predictions than simpler methods, including point estimates obtained from SGD, and argues that it is timely to focus on understanding the origin of the improved performance of cold posteriors.
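A "cold" posterior here is the Bayes posterior tempered with a temperature T < 1, i.e. sampling from

```latex
p_T(\theta \mid \mathcal{D}) \propto \exp\!\left(-\frac{U(\theta)}{T}\right),
\qquad
U(\theta) = -\sum_{i=1}^{N} \log p(y_i \mid x_i, \theta) - \log p(\theta).
```

T = 1 recovers the exact Bayes posterior; the puzzle the paper raises is that T < 1 often predicts better.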

Can You Trust Your Model's Uncertainty? Evaluating Predictive Uncertainty Under Dataset Shift

TLDR
A large-scale benchmark of existing state-of-the-art methods on classification problems and the effect of dataset shift on accuracy and calibration is presented, finding that traditional post-hoc calibration does indeed fall short, as do several other previous methods.

Deep Ensembles: A Loss Landscape Perspective

TLDR
Developing the concept of the diversity--accuracy plane, it is shown that the decorrelation power of random initializations is unmatched by popular subspace sampling methods, and the experimental results validate the hypothesis that deep ensembles work well under dataset shift.

Bayesian Learning via Stochastic Gradient Langevin Dynamics

In this paper we propose a new framework for learning from large scale datasets based on iterative learning from small mini-batches. By adding the right amount of noise to a standard stochastic gradient optimization algorithm we show that the iterates will converge to samples from the true posterior distribution as we anneal the stepsize.
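The resulting update, for step sizes ε_t and a minibatch of n points drawn from a dataset of size N, is a stochastic gradient step on the log posterior plus Gaussian noise scaled to the step size:

```latex
\Delta\theta_t = \frac{\epsilon_t}{2}\left(\nabla \log p(\theta_t)
  + \frac{N}{n}\sum_{i=1}^{n} \nabla \log p(x_{t_i} \mid \theta_t)\right) + \eta_t,
\qquad \eta_t \sim \mathcal{N}(0, \epsilon_t).
```

Early in training the gradient term dominates and the update behaves like ordinary SGD; as ε_t is annealed, the injected noise dominates and the iterates become approximate posterior samples.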

Weight Uncertainty in Neural Networks

TLDR
This work introduces a new, efficient, principled and backpropagation-compatible algorithm for learning a probability distribution on the weights of a neural network, called Bayes by Backprop, and shows how the learnt uncertainty in the weights can be used to improve generalisation in non-linear regression problems.
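The core of Bayes by Backprop is a factorized Gaussian posterior over weights trained with the reparameterization trick; a minimal sketch of one such layer, in which the standard-normal prior, the initialization, and the omitted bias term are simplifying assumptions:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class BayesLinear(nn.Module):
    """Linear layer with a learned Gaussian q(w) = N(mu, sigma^2) per weight."""
    def __init__(self, n_in, n_out):
        super().__init__()
        self.mu = nn.Parameter(torch.zeros(n_out, n_in))
        self.rho = nn.Parameter(torch.full((n_out, n_in), -5.0))  # sigma = softplus(rho)

    def forward(self, x):
        sigma = F.softplus(self.rho)
        w = self.mu + sigma * torch.randn_like(sigma)  # reparameterization trick
        return F.linear(x, w)

    def kl(self):
        # closed-form KL(q(w) || N(0, I)) for a factorized Gaussian posterior
        sigma = F.softplus(self.rho)
        return (0.5 * (self.mu**2 + sigma**2 - 1.0) - torch.log(sigma)).sum()

# per-minibatch objective (the variational free energy):
#   loss = cross_entropy(model(x), y) + total_kl / num_batches
```

Because the noise is drawn inside `forward`, gradients flow through mu and rho by ordinary backpropagation, which is what makes the algorithm "backpropagation-compatible".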