Corpus ID: 219636373

A benchmark study on reliable molecular supervised learning via Bayesian learning

@article{Hwang2020ABS,
  title={A benchmark study on reliable molecular supervised learning via Bayesian learning},
  author={Doyeong Hwang and Grace Lee and Hanseok Jo and Seyoul Yoon and Seongok Ryu},
  journal={ArXiv},
  year={2020},
  volume={abs/2006.07021}
}
Virtual screening aims to find desirable compounds in a chemical library using computational methods. When machine learning is used for this purpose, model outputs that can be interpreted as predictive probabilities are beneficial, since a high prediction score should then correspond to a high probability of correctness. In this work, we present a study on the prediction performance and reliability of graph neural networks trained with recently proposed Bayesian learning algorithms. Our work shows that… 
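
The usefulness of calibrated predictive probabilities for screening can be made concrete with a small worked example: if the model's score really is the probability that a compound is active, the expected number of true actives among the top-k picks is simply the sum of those probabilities. The NumPy sketch below illustrates this with synthetic scores; the data and the cutoff of 1,000 compounds are assumptions for illustration, not part of the paper.

# Toy illustration (NumPy, synthetic scores) of why calibrated predictive
# probabilities help virtual screening: if p_i really is the probability that
# compound i is active, the expected number of actives among the top-k picks
# is simply the sum of their probabilities.
import numpy as np

rng = np.random.default_rng(0)
probs = rng.uniform(size=100_000)        # stand-in for model outputs p(active | x)
top_k = np.sort(probs)[::-1][:1000]      # screen: keep the 1,000 highest-scoring compounds

expected_hits = top_k.sum()              # valid only if the probabilities are calibrated
print(f"expected actives in top-1000: {expected_hits:.0f}")
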

Citations

Bayesian Graph Neural Networks for Molecular Property Prediction
TLDR
This study benchmarks a set of Bayesian methods applied to a directed MPNN, using the QM9 regression dataset, and finds that capturing uncertainty in both readout and message passing parameters yields enhanced predictive accuracy, calibration, and performance on a downstream molecular search task.
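
A minimal sketch of the idea named in this TLDR, capturing uncertainty in both the message-passing and readout parameters, can be written with MC dropout: dropout stays active at prediction time in both sets of weights and several stochastic forward passes are averaged. The toy graph, layer sizes, and the choice of MC dropout (rather than the full set of methods benchmarked in the cited study) are assumptions.

# Minimal sketch (PyTorch, toy dense-adjacency graph) of capturing uncertainty in
# both message-passing and readout parameters with MC dropout: dropout is kept
# active at prediction time and T stochastic forward passes are averaged.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyMPNN(nn.Module):
    def __init__(self, in_dim=16, hid=32, p=0.2):
        super().__init__()
        self.msg = nn.Linear(in_dim, hid)      # message-passing weights
        self.readout = nn.Linear(hid, 1)       # readout weights
        self.p = p

    def forward(self, x, adj):
        h = F.dropout(x, self.p, training=True)   # stochastic even at test time
        h = F.relu(adj @ self.msg(h))             # one round of neighbour aggregation
        h = F.dropout(h, self.p, training=True)
        return self.readout(h.mean(dim=0))        # graph-level prediction

model = TinyMPNN()
x, adj = torch.randn(5, 16), torch.eye(5)         # toy 5-atom graph
samples = torch.stack([model(x, adj) for _ in range(30)])
mean, std = samples.mean(0), samples.std(0)       # prediction and uncertainty
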
Reliable Graph Neural Networks for Drug Discovery Under Distributional Shift
TLDR
This work introduces CardioTox, a real-world benchmark on drug cardiotoxicity to facilitate reliability research on Graph Neural Networks, and demonstrates GNN-SNGP’s effectiveness in increasing distance-awareness, reducing overconfident mispredictions, and making better-calibrated predictions without sacrificing accuracy.
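
The two ingredients behind the distance-awareness mentioned here are a spectrally normalized encoder and a Gaussian-process output layer approximated with random features. The sketch below shows those pieces on plain feature vectors rather than a graph network; the layer sizes and toy inputs are assumptions, and this is not the GNN-SNGP reference implementation.

# Illustrative sketch of a distance-aware head: spectral normalization keeps the
# encoder roughly distance-preserving, and a random-feature approximation of an
# RBF-kernel GP replaces the usual dense output layer.
import math
import torch
import torch.nn as nn

class RandomFeatureGPHead(nn.Module):
    """Approximate GP output layer via random Fourier features (RBF kernel)."""
    def __init__(self, in_dim, num_features=1024, num_classes=2):
        super().__init__()
        # Fixed random projection defines the kernel approximation.
        self.register_buffer("W", torch.randn(num_features, in_dim))
        self.register_buffer("b", 2 * math.pi * torch.rand(num_features))
        self.beta = nn.Linear(num_features, num_classes)   # trainable output weights

    def forward(self, h):
        phi = math.sqrt(2.0 / self.W.shape[0]) * torch.cos(h @ self.W.T + self.b)
        return self.beta(phi)

encoder = nn.Sequential(
    nn.utils.spectral_norm(nn.Linear(64, 128)), nn.ReLU(),
    nn.utils.spectral_norm(nn.Linear(128, 128)), nn.ReLU(),
)
head = RandomFeatureGPHead(in_dim=128)
logits = head(encoder(torch.randn(8, 64)))   # [8, 2] class logits for 8 toy inputs
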
Gaussian Process Molecule Property Prediction with FlowMO
TLDR
FlowMO is an open-source Python library for molecular property prediction with Gaussian Processes, built upon GPflow and RDKit, and it demonstrates predictive performance comparable to deep learning methods with superior uncertainty calibration.
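
FlowMO itself builds on GPflow and RDKit; the sketch below conveys the same workflow, Gaussian-process regression on molecular fingerprints with predictive uncertainty, but uses scikit-learn's GP and an RBF kernel instead of FlowMO's own kernels. The SMILES strings and property values are toy placeholders.

# Illustrative sketch of GP molecular property prediction in the spirit of FlowMO,
# using RDKit Morgan fingerprints with scikit-learn's GP; toy data only.
import numpy as np
from rdkit import Chem
from rdkit.Chem import AllChem
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

def fingerprint(smiles, n_bits=2048):
    mol = Chem.MolFromSmiles(smiles)
    fp = AllChem.GetMorganFingerprintAsBitVect(mol, radius=2, nBits=n_bits)
    return np.array(list(fp), dtype=float)

smiles = ["CCO", "c1ccccc1", "CC(=O)O", "CCN"]
y = np.array([-0.2, 1.5, 0.3, -0.1])               # toy property values
X = np.stack([fingerprint(s) for s in smiles])

gp = GaussianProcessRegressor(kernel=RBF(length_scale=10.0), alpha=1e-3)
gp.fit(X, y)
mean, std = gp.predict(X, return_std=True)          # predictions with uncertainty
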

References

Showing 1-10 of 27 references
MoleculeNet: A Benchmark for Molecular Machine Learning
TLDR
MoleculeNet benchmarks demonstrate that learnable representations are powerful tools for molecular machine learning and broadly offer the best performance; however, this result comes with caveats.
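
MoleculeNet datasets are commonly accessed through the DeepChem library. A minimal example of pulling one benchmark task is sketched below; the loader name and return values follow DeepChem's documented API, but defaults can differ between versions.

# Sketch of pulling a MoleculeNet benchmark through DeepChem (the library that
# distributes MoleculeNet); loader arguments follow DeepChem's documented API
# but may vary by version.
import deepchem as dc

tasks, (train, valid, test), transformers = dc.molnet.load_tox21(featurizer="ECFP")
print(len(tasks), "tasks;", train.X.shape[0], "training molecules")
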
Simple and Scalable Predictive Uncertainty Estimation using Deep Ensembles
TLDR
This work proposes an alternative to Bayesian NNs that is simple to implement, readily parallelizable, requires very little hyperparameter tuning, and yields high quality predictive uncertainty estimates.
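
The deep-ensembles recipe is short enough to sketch directly: train M independently initialized networks on the same data and average their predictive probabilities. Everything in the snippet (architecture, toy data, M = 5) is a placeholder chosen for illustration.

# Minimal deep-ensemble sketch (PyTorch): train M independently initialised
# networks and average their predictive probabilities.
import torch
import torch.nn as nn

def make_net():
    return nn.Sequential(nn.Linear(16, 64), nn.ReLU(), nn.Linear(64, 2))

X, y = torch.randn(256, 16), torch.randint(0, 2, (256,))
ensemble = []
for _ in range(5):                                    # M = 5 members
    net = make_net()
    opt = torch.optim.Adam(net.parameters(), lr=1e-3)
    for _ in range(200):
        opt.zero_grad()
        loss = nn.functional.cross_entropy(net(X), y)
        loss.backward()
        opt.step()
    ensemble.append(net)

with torch.no_grad():
    # Ensemble predictive probability: mean of member softmax outputs.
    probs = torch.stack([m(X).softmax(-1) for m in ensemble]).mean(0)
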
Bayesian Deep Learning and a Probabilistic Perspective of Generalization
TLDR
It is shown that deep ensembles provide an effective mechanism for approximate Bayesian marginalization, and a related approach is proposed that further improves the predictive distribution by marginalizing within basins of attraction, without significant overhead.
On Calibration of Modern Neural Networks
TLDR
It is discovered that modern neural networks, unlike those from a decade ago, are poorly calibrated, and on most datasets, temperature scaling -- a single-parameter variant of Platt Scaling -- is surprisingly effective at calibrating predictions.
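
Temperature scaling itself fits in a few lines: learn a single temperature T on held-out validation logits by minimizing negative log-likelihood, then divide test logits by T before the softmax. The validation logits and labels below are synthetic placeholders.

# Minimal temperature-scaling sketch (PyTorch): fit one temperature T on
# validation logits by minimising NLL, then use softmax(logits / T) at test time.
import torch

val_logits = torch.randn(512, 10) * 3.0             # stand-in for a trained model's logits
val_labels = torch.randint(0, 10, (512,))

log_T = torch.zeros(1, requires_grad=True)           # optimise log T so T stays positive
opt = torch.optim.LBFGS([log_T], lr=0.1, max_iter=50)

def closure():
    opt.zero_grad()
    loss = torch.nn.functional.cross_entropy(val_logits / log_T.exp(), val_labels)
    loss.backward()
    return loss

opt.step(closure)
T = log_T.exp().item()
print(f"fitted temperature: {T:.2f}")                 # calibrated probs = softmax(logits / T)
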
Can You Trust Your Model's Uncertainty? Evaluating Predictive Uncertainty Under Dataset Shift
TLDR
A large-scale benchmark of existing state-of-the-art methods on classification problems and the effect of dataset shift on accuracy and calibration is presented, finding that traditional post-hoc calibration does indeed fall short, as do several other previous methods.
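
The calibration metric such shift benchmarks typically track is expected calibration error (ECE): bin predictions by confidence and compare each bin's average confidence with its accuracy. A small sketch with synthetic, deliberately overconfident predictions:

# Sketch of expected calibration error (ECE): bin predictions by confidence and
# compare average confidence with accuracy in each bin; inputs are synthetic.
import numpy as np

def expected_calibration_error(confidences, correct, n_bins=15):
    bins = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(bins[:-1], bins[1:]):
        mask = (confidences > lo) & (confidences <= hi)
        if mask.any():
            gap = abs(correct[mask].mean() - confidences[mask].mean())
            ece += mask.mean() * gap                  # weight by bin occupancy
    return ece

rng = np.random.default_rng(0)
conf = rng.uniform(0.5, 1.0, size=10_000)
correct = rng.uniform(size=10_000) < conf * 0.9       # deliberately overconfident model
print(f"ECE: {expected_calibration_error(conf, correct):.3f}")
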
Bayesian Learning via Stochastic Gradient Langevin Dynamics
In this paper we propose a new framework for learning from large scale datasets based on iterative learning from small mini-batches. By adding the right amount of noise to a standard stochastic gradient optimization algorithm we show that the iterates will converge to samples from the true posterior distribution as we anneal the stepsize.
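
The resulting update is a plain SGD step on a minibatch estimate of the log-posterior gradient plus Gaussian noise whose variance equals the step size. The sketch below writes that update out by hand in PyTorch; the model, data, prior scale, and fixed step size are toy assumptions (the paper anneals the step size).

# Minimal SGLD update sketch (PyTorch): gradient step on the estimated negative
# log-posterior plus injected Gaussian noise with variance equal to the step size.
import torch
import torch.nn as nn

model = nn.Linear(16, 2)
X, y = torch.randn(1024, 16), torch.randint(0, 2, (1024,))
N, batch, eps, prior_var = len(X), 64, 1e-4, 1.0

for step in range(1000):
    idx = torch.randint(0, N, (batch,))
    nll = nn.functional.cross_entropy(model(X[idx]), y[idx], reduction="sum")
    # Gaussian prior on the weights; (N / batch) rescales the minibatch likelihood.
    log_prior = -sum((p ** 2).sum() for p in model.parameters()) / (2 * prior_var)
    loss = (N / batch) * nll - log_prior
    model.zero_grad()
    loss.backward()
    with torch.no_grad():
        for p in model.parameters():
            p -= 0.5 * eps * p.grad                    # gradient step
            p += torch.randn_like(p) * (eps ** 0.5)    # injected Langevin noise
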
Weight Uncertainty in Neural Networks
TLDR
This work introduces a new, efficient, principled and backpropagation-compatible algorithm for learning a probability distribution on the weights of a neural network, called Bayes by Backprop, and shows how the learnt uncertainty in the weights can be used to improve generalisation in non-linear regression problems.
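
A minimal Bayes-by-Backprop-style layer keeps a mean and a (softplus-parameterized) standard deviation per weight, samples weights with the reparameterization trick, and adds the KL divergence to a standard-normal prior to the loss. Sizes, prior, and data below are toy assumptions.

# Minimal Bayes-by-Backprop-style sketch (PyTorch): a linear layer with a learned
# factorised Gaussian over its weights, trained with reparameterised samples and
# a KL penalty to a standard-normal prior.
import torch
import torch.nn as nn
import torch.nn.functional as F

class BayesLinear(nn.Module):
    def __init__(self, in_f, out_f):
        super().__init__()
        self.w_mu = nn.Parameter(torch.zeros(out_f, in_f))
        self.w_rho = nn.Parameter(torch.full((out_f, in_f), -5.0))  # softplus(rho) = std
        self.b = nn.Parameter(torch.zeros(out_f))

    def forward(self, x):
        std = F.softplus(self.w_rho)
        w = self.w_mu + std * torch.randn_like(std)                  # reparameterised sample
        # KL(q || N(0, 1)) for a factorised Gaussian posterior.
        self.kl = (0.5 * (std ** 2 + self.w_mu ** 2 - 1) - std.log()).sum()
        return F.linear(x, w, self.b)

layer = BayesLinear(16, 2)
X, y = torch.randn(256, 16), torch.randint(0, 2, (256,))
opt = torch.optim.Adam(layer.parameters(), lr=1e-2)
for _ in range(500):
    opt.zero_grad()
    loss = F.cross_entropy(layer(X), y) + layer.kl / len(X)          # ELBO up to constants
    loss.backward()
    opt.step()
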
Benchmarking Graph Neural Networks
TLDR
A reproducible GNN benchmarking framework is introduced, with the facility for researchers to add new models conveniently for arbitrary datasets, and a principled investigation into the recent Weisfeiler-Lehman GNNs (WL-GNNs) compared to message passing-based graph convolutional networks (GCNs).
A Simple Baseline for Bayesian Uncertainty in Deep Learning
TLDR
It is demonstrated that SWAG performs well on a wide variety of tasks, including out of sample detection, calibration, and transfer learning, in comparison to many popular alternatives including MC dropout, KFAC Laplace, SGLD, and temperature scaling.
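
The diagonal part of SWAG is easy to sketch: along the tail of an SGD run, keep running first and second moments of the flattened weights, then sample weight vectors from the implied Gaussian at test time. The snippet below omits SWAG's low-rank covariance term, and its model, data, and snapshot schedule are placeholders.

# Minimal diagonal-SWAG-style sketch (PyTorch): running moments of the weights
# along the SGD tail, then a Gaussian weight sample at test time.
import torch
import torch.nn as nn
import torch.nn.utils as utils

model = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 2))
X, y = torch.randn(256, 16), torch.randint(0, 2, (256,))
opt = torch.optim.SGD(model.parameters(), lr=1e-2)

mean = torch.zeros_like(utils.parameters_to_vector(model.parameters()))
sq_mean, n_snapshots = torch.zeros_like(mean), 0

for step in range(2000):
    opt.zero_grad()
    nn.functional.cross_entropy(model(X), y).backward()
    opt.step()
    if step >= 1000 and step % 50 == 0:               # collect snapshots along the SGD tail
        w = utils.parameters_to_vector(model.parameters()).detach()
        n_snapshots += 1
        mean += (w - mean) / n_snapshots
        sq_mean += (w ** 2 - sq_mean) / n_snapshots

var = (sq_mean - mean ** 2).clamp(min=1e-8)
sample = mean + var.sqrt() * torch.randn_like(mean)   # one approximate posterior sample
utils.vector_to_parameters(sample, model.parameters())
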
Semi-Supervised Classification with Graph Convolutional Networks
TLDR
A scalable approach for semi-supervised learning on graph-structured data, based on an efficient variant of convolutional neural networks that operate directly on graphs, which outperforms related methods by a significant margin.
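
The layer at the heart of this paper propagates node features through a symmetrically normalized adjacency with self-loops, H' = sigma(D^{-1/2} (A + I) D^{-1/2} H W). A minimal sketch follows; the toy three-node graph and dimensions are assumptions.

# Minimal sketch of the graph convolution from Kipf & Welling: symmetric
# normalisation of the adjacency (with self-loops) followed by a linear map.
import torch
import torch.nn as nn

class GCNLayer(nn.Module):
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.lin = nn.Linear(in_dim, out_dim, bias=False)

    def forward(self, x, adj):
        a_hat = adj + torch.eye(adj.shape[0])                # add self-loops
        d_inv_sqrt = a_hat.sum(dim=1).pow(-0.5)
        norm = d_inv_sqrt[:, None] * a_hat * d_inv_sqrt[None, :]
        return torch.relu(norm @ self.lin(x))

adj = torch.tensor([[0., 1., 0.], [1., 0., 1.], [0., 1., 0.]])  # 3-node path graph
x = torch.randn(3, 8)
out = GCNLayer(8, 16)(x, adj)                                    # [3, 16] node embeddings
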
...