• Corpus ID: 160705

Dropout as a Bayesian Approximation: Representing Model Uncertainty in Deep Learning

@article{Gal2016DropoutAA,
  title={Dropout as a Bayesian Approximation: Representing Model Uncertainty in Deep Learning},
  author={Yarin Gal and Zoubin Ghahramani},
  journal={ArXiv},
  year={2016},
  volume={abs/1506.02142}
}
Deep learning tools have gained tremendous attention in applied machine learning. […] Key Result: We show a considerable improvement in predictive log-likelihood and RMSE compared to existing state-of-the-art methods, and finish by using dropout's uncertainty in deep reinforcement learning.
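The test-time procedure behind this result, often referred to as MC dropout, keeps dropout active at prediction time and summarizes several stochastic forward passes. Below is a minimal, illustrative PyTorch sketch assuming a small regression network; the layer sizes, dropout rate, and number of samples are placeholder choices rather than the paper's settings, and the paper's full predictive variance additionally includes an observation-noise term that is omitted here.

```python
# Illustrative sketch of MC dropout (not the authors' code): keep dropout
# active at test time and aggregate several stochastic forward passes; the
# spread across passes serves as a proxy for model uncertainty.
import torch
import torch.nn as nn

# Hypothetical small regression network with dropout after each weight layer.
# In practice this model would be trained first; it is left untrained here.
model = nn.Sequential(
    nn.Linear(1, 64), nn.ReLU(), nn.Dropout(p=0.1),
    nn.Linear(64, 64), nn.ReLU(), nn.Dropout(p=0.1),
    nn.Linear(64, 1),
)

def mc_dropout_predict(model, x, n_samples=100):
    """Predictive mean and standard deviation from stochastic forward passes."""
    model.train()  # keep dropout layers stochastic at inference time
    with torch.no_grad():
        samples = torch.stack([model(x) for _ in range(n_samples)])
    return samples.mean(dim=0), samples.std(dim=0)

x_test = torch.linspace(-3, 3, 50).unsqueeze(1)
mean, std = mc_dropout_predict(model, x_test)  # std widens where the model is unsure
```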

Citations

Dropout as a Bayesian Approximation: Insights and Applications
TLDR
It is shown that a multilayer perceptron (MLP) with arbitrary depth and non-linearities, with dropout applied after every weight layer, is mathematically equivalent to an approximation to a well-known Bayesian model.
Bayesian Uncertainty Estimation for Batch Normalized Deep Networks
TLDR
It is shown that training a deep network using batch normalization is equivalent to approximate inference in Bayesian models, and it is demonstrated how this finding can be used to make useful estimates of the model uncertainty.
Variational Inference to Measure Model Uncertainty in Deep Neural Networks
TLDR
A novel approach for training deep neural networks in a Bayesian way that uses variational inference to approximate the intractable posterior distribution on the basis of a normal prior, and that can be used to calculate credible intervals for the prediction and to optimize the network architecture for a given training data set.
Improving Bayesian Inference in Deep Neural Networks with Variational Structured Dropout
TLDR
This work addresses a restriction of the factorized structure of the dropout posterior, which is too inflexible to capture rich correlations among the weight parameters of the true posterior, and proposes a novel method called Variational Structured Dropout (VSD) to overcome this limitation.
Novel Uncertainty Framework for Deep Learning Ensembles
TLDR
A novel statistical-mechanics-based framework for dropout is proposed, and this framework is used to derive a new generic algorithm that focuses on estimating the variance of the loss as measured by the ensemble of thinned networks.
Measuring the Uncertainty of Predictions in Deep Neural Networks with Variational Inference
TLDR
A novel approach for training deep neural networks in a Bayesian way that quantifies the uncertainty in the model parameters while adding only very few additional parameters to be optimized; it can be used to calculate credible intervals for the network prediction and to optimize the network architecture for the dataset at hand.
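The variational approach summarized above rests on a factorized (mean-field) Gaussian posterior over the weights. The sketch below illustrates that general technique with the reparameterization trick and a standard normal prior; it is a generic illustration assuming a single linear layer, not the cited paper's specific parameterization.

```python
# Hedged sketch of a mean-field Gaussian variational linear layer.
# The prior is N(0, 1) per weight; names and initializations are illustrative.
import torch
import torch.nn as nn

class VariationalLinear(nn.Module):
    def __init__(self, in_features, out_features):
        super().__init__()
        # Variational parameters: mean and log-std of the factorized Gaussian posterior.
        self.w_mu = nn.Parameter(torch.zeros(out_features, in_features))
        self.w_logstd = nn.Parameter(torch.full((out_features, in_features), -3.0))

    def forward(self, x):
        # Reparameterization trick: w = mu + sigma * eps keeps the sample differentiable.
        eps = torch.randn_like(self.w_mu)
        w = self.w_mu + self.w_logstd.exp() * eps
        return x @ w.t()

    def kl(self):
        # KL( N(mu, sigma^2) || N(0, 1) ), summed over all weights.
        var = (2.0 * self.w_logstd).exp()
        return 0.5 * (var + self.w_mu ** 2 - 1.0 - 2.0 * self.w_logstd).sum()
```

Training such a layer would minimize the negative log-likelihood plus the summed KL terms (an evidence lower bound); repeated stochastic forward passes at test time then yield samples from which credible intervals can be read off.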
Training-Free Uncertainty Estimation for Dense Regression: Sensitivity as a Surrogate
TLDR
Three simple and scalable methods for analyzing the variance of outputs from a trained network under tolerable perturbations are proposed: infer-transformation, infer-noise, and infer-dropout, which produce uncertainty estimates that are comparable to, or better than, those of state-of-the-art methods that require training.
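The infer-noise variant above lends itself to a very small sketch: without retraining or modifying the trained network, the input is perturbed several times and an uncertainty surrogate is read off the output variance. The noise scale and sample count below are illustrative assumptions, not the paper's settings, and `model` stands in for any already-trained network.

```python
# Rough illustration of a training-free, perturbation-based uncertainty surrogate
# in the spirit of "infer-noise": perturb the input of a fixed, trained model
# and use the variance of its outputs as an uncertainty estimate.
import torch

def infer_noise_uncertainty(model, x, noise_std=0.02, n_samples=20):
    """Mean and variance of outputs under small input perturbations."""
    model.eval()  # the trained model itself is left untouched
    with torch.no_grad():
        outputs = torch.stack(
            [model(x + noise_std * torch.randn_like(x)) for _ in range(n_samples)]
        )
    return outputs.mean(dim=0), outputs.var(dim=0)
```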
Learning a Hierarchy of Neural Connections for Modeling Uncertainty
TLDR
It is suggested that the generative process of unlabeled data be treated as a confounder, thereby conditioning the prior of the discriminative neural network on the parameters of the generative process; this approach ultimately translates into a compact hierarchy of sub-networks, a new deep architecture.
Modeling Uncertainty by Learning a Hierarchy of Deep Neural Connections
TLDR
This work proposes an approach for modeling such a confounder by sharing neural connectivity patterns between the generative and discriminative networks, which leads to a new deep architecture in which networks are sampled from the posterior over local causal structures and coupled into a compact hierarchy.
Dropout Strikes Back: Improved Uncertainty Estimation via Diversity Sampled Implicit Ensembles
TLDR
It is shown that increasing the diversity of realizations sampled from a neural network with dropout helps to improve the quality of uncertainty estimation, and that diversification via determinantal point process (DPP) based sampling achieves state-of-the-art results in uncertainty estimation for regression and classification tasks.

References

SHOWING 1-10 OF 71 REFERENCES
Ensemble learning in Bayesian neural networks
TLDR
This chapter shows how the ensemble learning approach can be extended to full-covariance Gaussian distributions while remaining computationally tractable, and extends the framework to deal with hyperparameters, leading to a simple re-estimation procedure.
Probabilistic Backpropagation for Scalable Learning of Bayesian Neural Networks
TLDR
This work presents a novel scalable method for learning Bayesian neural networks, called probabilistic backpropagation (PBP), which works by computing a forward propagation of probabilities through the network and then doing a backward computation of gradients.
Stochastic Backpropagation and Approximate Inference in Deep Generative Models
We marry ideas from deep neural networks and approximate Bayesian inference to derive a generalised class of deep, directed generative models, endowed with a new algorithm for scalable inference and learning.
Practical Bayesian Optimization of Machine Learning Algorithms
TLDR
This work describes new algorithms that take into account the variable cost of learning-algorithm experiments and that can leverage multiple cores for parallel experimentation, and shows that the proposed algorithms improve on previous automatic procedures and can reach or surpass human expert-level optimization for many algorithms.
Dropout: a simple way to prevent neural networks from overfitting
TLDR
It is shown that dropout improves the performance of neural networks on supervised learning tasks in vision, speech recognition, document classification and computational biology, obtaining state-of-the-art results on many benchmark data sets.
Auto-Encoding Variational Bayes
TLDR
A stochastic variational inference and learning algorithm that scales to large datasets and, under some mild differentiability conditions, even works in the intractable case is introduced.
A Practical Bayesian Framework for Backpropagation Networks
  • D. Mackay, Neural Computation, 1992
TLDR
A quantitative and practical Bayesian framework is described for learning of mappings in feedforward networks that automatically embodies "Occam's razor," penalizing overflexible and overcomplex models.
Deep Gaussian Processes
TLDR
Deep Gaussian process (GP) models are introduced, and model selection by the variational bound shows that a five-layer hierarchy is justified even when modelling a digit data set containing only 150 examples.
Human-level control through deep reinforcement learning
TLDR
This work bridges the divide between high-dimensional sensory inputs and actions, resulting in the first artificial agent that is capable of learning to excel at a diverse array of challenging tasks.
Gaussian Processes for Machine Learning (Adaptive Computation and Machine Learning)
TLDR
The treatment is comprehensive and self-contained, targeted at researchers and students in machine learning and applied statistics, and includes detailed algorithms for supervised-learning problems in both regression and classification.