# On Modern Deep Learning and Variational Inference

```bibtex
@inproceedings{Gal2015OnMD,
  title  = {On Modern Deep Learning and Variational Inference},
  author = {Y. Gal and Zoubin Ghahramani},
  year   = {2015}
}
```

Bayesian modelling and variational inference are rooted in Bayesian statistics, and easily benefit from the vast literature in the field. In contrast, deep learning lacks a solid mathematical grounding. Instead, empirical developments in deep learning are often justified by metaphors, evading the unexplained principles at play. It is perhaps astonishing then that most modern deep learning models can be cast as performing approximate variational inference in a Bayesian setting. This…
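
The paper's central claim — that dropout networks perform approximate variational inference — is most often exercised in practice as "MC dropout": keep dropout active at test time and average stochastic forward passes to obtain a predictive mean and uncertainty. A minimal NumPy sketch of that idea (the two-layer network, fixed random weights, and dropout rate are illustrative assumptions, not from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative two-layer regression network with fixed random weights.
W1 = rng.standard_normal((1, 50))
W2 = rng.standard_normal((50, 1))

def stochastic_forward(x, p=0.5):
    """One forward pass with a fresh dropout mask (kept on at test time)."""
    h = np.maximum(x @ W1, 0.0)              # ReLU hidden layer
    mask = rng.random(h.shape) < (1.0 - p)   # Bernoulli dropout mask
    h = h * mask / (1.0 - p)                 # inverted-dropout rescaling
    return h @ W2

x = np.array([[0.3]])
samples = np.stack([stochastic_forward(x) for _ in range(200)])

# Moments of the sampled outputs approximate the posterior predictive
# mean and uncertainty under the variational interpretation of dropout.
mean = samples.mean(axis=0)
std = samples.std(axis=0)
```

Each stochastic pass corresponds to one draw from the approximate posterior over weights, so the spread of `samples` is a (crude) model-uncertainty estimate.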

#### 13 Citations

A comparison of frequentist methods and Bayesian approximations in the implementation of Convolutional Neural Networks in an Active Learning setting

- Mario Humberto Becerra Contreras

In this work, an approximate Bayesian approach for Deep Learning is compared with a conventional approach, all within a context of Active Learning. The conventional approach is based on the…

ShapeOdds: Variational Bayesian Learning of Generative Shape Models

- Mathematics, Computer Science
- 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
- 2017

A variational approach for learning a latent-variable generative shape model. The proposed model is demonstrated to generate realistic samples, generalize to unseen examples, and handle missing regions and/or background clutter, while comparing favorably with recent neural-network-based approaches.

Variational Bayesian Parameter Estimation Techniques for the General Linear Model

- Computer Science, Medicine
- Front. Neurosci.
- 2017

The conceptual and formal underpinnings of VB, VML, ReML, and ML are revisited, and a detailed account of their mathematical relationships and implementational details is provided.

Explainable Artificial Intelligence via Bayesian Teaching

- 2017

Modern machine learning methods are increasingly powerful and opaque. This opaqueness is a concern across a variety of domains in which algorithms are making important decisions that should be…

Addressing model uncertainty in probabilistic forecasting using Monte Carlo dropout

- Computer Science
- 2020

In recent years, deep learning models have been developed to address probabilistic forecasting tasks, assuming an implicit stochastic process that relates past observed values to uncertain future…

Image classification with a MSF dropout

- Computer Science
- Multimedia Tools and Applications
- 2019

A multi-scale fusion (MSF) dropout method built on standard dropout is proposed; experiments show that prediction accuracy improves significantly over two other kinds of dropout, verifying the effectiveness of the multi-scale fusion method.

Dropout distillation

- Computer Science
- ICML
- 2016

This work introduces a novel approach, coined "dropout distillation", that makes it possible to train a predictor that better approximates the intractable, but preferable, averaging process, while keeping its computational cost under control.

Reliable Deep Grade Prediction with Uncertainty Estimation

- Computer Science
- LAK
- 2019

Two types of Bayesian deep learning models for grade prediction under a course-specific framework are presented, based on the assumption that prior courses provide students with knowledge for future courses, so that grades in prior courses can be used to predict grades in a future course.

Master Thesis: Data-Driven Machine Learning of Turbulence Models

- 2018

In this project the student will implement her/his own version of a learning machine, leveraging the underlying laws of physics to then extract high-dimensional data patterns from Computational Fluid…

Convergent Block Coordinate Descent for Training Tikhonov Regularized Deep Neural Networks

- Computer Science, Mathematics
- NIPS
- 2017

By lifting the ReLU function into a higher-dimensional space, this work develops a smooth multi-convex formulation for training feed-forward deep neural networks (DNNs) and proves that the resulting block coordinate descent (BCD) algorithm converges globally to a stationary point with an R-linear convergence rate of order one.

#### References

Showing 1–10 of 19 references

Dropout as a Bayesian Approximation: Representing Model Uncertainty in Deep Learning

- Mathematics, Computer Science
- ICML
- 2016

A new theoretical framework is developed casting dropout training in deep neural networks (NNs) as approximate Bayesian inference in deep Gaussian processes, which mitigates the problem of representing uncertainty in deep learning without sacrificing either computational complexity or test accuracy.

Dropout as a Bayesian Approximation: Insights and Applications

- 2015

Deep learning techniques are increasingly widely used, but they lack the ability to reason about uncertainty over the features. Features extracted from a dataset are given as point estimates, and do…

Bayesian Convolutional Neural Networks with Bernoulli Approximate Variational Inference

- Computer Science, Mathematics
- ArXiv
- 2015

This work presents an efficient Bayesian CNN offering better robustness to over-fitting on small data than traditional approaches, approximating the model's intractable posterior with Bernoulli variational distributions.

Ensemble learning in Bayesian neural networks

- Mathematics
- 1998

Bayesian treatments of learning in neural networks are typically based either on a local Gaussian approximation to a mode of the posterior weight distribution, or on Markov chain Monte Carlo…

Practical Variational Inference for Neural Networks

- Computer Science, Mathematics
- NIPS
- 2011

This paper introduces an easy-to-implement stochastic variational method (or equivalently, minimum description length loss function) that can be applied to most neural networks and revisits several common regularisers from a variational perspective.

Gaussian Processes for Machine Learning (Adaptive Computation and Machine Learning)

- Computer Science
- 2005

The treatment is comprehensive and self-contained, targeted at researchers and students in machine learning and applied statistics, and includes detailed algorithms for supervised-learning problems, both regression and classification.

Learning Multiple Layers of Features from Tiny Images

- Computer Science
- 2009

It is shown how to train a multi-layer generative model that learns to extract meaningful features which resemble those found in the human visual cortex, using a novel parallelization algorithm to distribute the work among multiple machines connected on a network.

Dropout: a simple way to prevent neural networks from overfitting

- Computer Science
- J. Mach. Learn. Res.
- 2014

It is shown that dropout improves the performance of neural networks on supervised learning tasks in vision, speech recognition, document classification and computational biology, obtaining state-of-the-art results on many benchmark data sets.
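
The dropout operation this reference describes — randomly zeroing units during training, with rescaling so activations keep their expected value, and no masking at test time — can be sketched in a few lines of NumPy (the rate, array, and seed are illustrative):

```python
import numpy as np

def dropout(a, p=0.5, training=True, rng=np.random.default_rng(42)):
    """Inverted dropout: zero each unit with prob. p, scale survivors by 1/(1-p)."""
    if not training:
        return a  # inference uses the full network; no rescaling needed
    mask = rng.random(a.shape) >= p
    return a * mask / (1.0 - p)

a = np.ones((4, 5))
out = dropout(a, p=0.5)
# E[out] equals a, since surviving units are scaled up by 1/(1-p).
```

The inverted-scaling convention keeps the test-time forward pass identical to a standard network, which is why modern frameworks use it by default.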

Weight Uncertainty in Neural Networks

- Mathematics, Computer Science
- ArXiv
- 2015

This work introduces a new, efficient, principled and backpropagation-compatible algorithm for learning a probability distribution on the weights of a neural network, called Bayes by Backprop, and shows how the learnt uncertainty in the weights can be used to improve generalisation in non-linear regression problems.
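
Bayes by Backprop keeps a Gaussian posterior over each weight and samples weights via the reparameterisation w = μ + σ·ε with ε ~ N(0, 1), so gradients flow to the variational parameters μ and σ. A minimal sketch of that sampling step (the shapes and the softplus parameterisation of σ are common-practice assumptions, not taken verbatim from the paper):

```python
import numpy as np

rng = np.random.default_rng(1)

def softplus(x):
    """log(1 + exp(x)); keeps sigma strictly positive."""
    return np.log1p(np.exp(x))

# Variational parameters for a 3x2 weight matrix: mean and pre-sigma (rho).
mu = np.zeros((3, 2))
rho = np.full((3, 2), -3.0)  # softplus(-3) ~ 0.049, a small initial sigma

def sample_weights():
    """Reparameterised draw w = mu + sigma * eps; eps carries the randomness."""
    sigma = softplus(rho)
    eps = rng.standard_normal(mu.shape)
    return mu + sigma * eps

w = sample_weights()
# Each training forward pass uses a fresh sample of w; the objective adds
# a KL(q(w) || p(w)) penalty on (mu, sigma) to the usual data loss.
```

Because the noise enters only through ε, backpropagation through a sampled forward pass yields unbiased gradient estimates for μ and ρ.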

ImageNet classification with deep convolutional neural networks

- Computer Science
- Commun. ACM
- 2012

A large, deep convolutional neural network was trained to classify the 1.2 million high-resolution images in the ImageNet LSVRC-2010 contest into 1000 different classes, employing a recently developed regularization method called "dropout" that proved to be very effective.