Corpus ID: 204838191

Stabilising priors for robust Bayesian deep learning

Felix McGregor, Arnu Pretorius, Johan A. du Preez, Steve Kroon
Bayesian neural networks (BNNs) have developed into useful tools for probabilistic modelling, thanks to recent advances in variational inference that enable large-scale BNNs. However, BNNs remain brittle and hard to train, especially (1) when using deep architectures consisting of many hidden layers and (2) in situations with large weight variances. We use signal propagation theory to quantify these challenges and propose self-stabilising priors. This is achieved by a reformulation of the ELBO to…
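The reformulated ELBO the abstract alludes to builds on the standard variational objective, E_q[log p(D|w)] − KL(q‖p). A minimal numerical sketch for a 1-D linear model with a factorised Gaussian posterior and standard-normal prior (the data, parameter values, and function names here are illustrative, not from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: roughly y = 2x + noise (illustrative assumption).
x = np.array([0.5, -1.0, 2.0])
y = np.array([1.0, -2.1, 4.2])

def elbo(mu, log_sigma, n_samples=2000, noise_var=0.1):
    """ELBO for q(w) = N(mu, sigma^2) against prior p(w) = N(0, 1)."""
    sigma = np.exp(log_sigma)
    # Closed-form KL( N(mu, sigma^2) || N(0, 1) )
    kl = 0.5 * (sigma**2 + mu**2 - 1.0 - 2.0 * log_sigma)
    # Monte Carlo estimate of E_q[log p(y | x, w)] via the
    # reparameterisation w = mu + sigma * eps, eps ~ N(0, 1)
    eps = rng.standard_normal(n_samples)
    w = mu + sigma * eps
    resid = y[None, :] - w[:, None] * x[None, :]
    log_lik = -0.5 * (resid**2 / noise_var
                      + np.log(2 * np.pi * noise_var)).sum(axis=1)
    return log_lik.mean() - kl
```

Maximising this estimate over (mu, log_sigma) is the essence of stochastic variational inference for BNNs; the KL term is closed-form here because both q and p are Gaussian.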


Graph reparameterizations for enabling 1000+ Monte Carlo iterations in Bayesian deep neural networks

A framework is constructed to describe these computation graphs and to identify probability families, corresponding to large classes of distributions, for which the graph size can be independent of, or only weakly dependent on, the number of MC samples.

Radial Spike and Slab Bayesian Neural Networks for Sparse Data in Ransomware Attacks

The Radial Spike and Slab Bayesian neural network, a new type of Bayesian neural network with a new form of approximate posterior distribution, is proposed, along with a representation of low-level events as MITRE ATT&CK tactics, techniques, and procedures.



Deterministic Variational Inference for Robust Bayesian Neural Networks

This work introduces a novel deterministic method to approximate moments in neural networks, eliminating gradient variance; it also introduces a hierarchical prior for parameters and a novel empirical Bayes procedure for automatically selecting prior variances, and demonstrates good predictive performance over alternative approaches.

Critical initialisation for deep signal propagation in noisy rectifier neural networks

A new framework for signal propagation in stochastic regularised neural networks is developed based on mean field theory, showing that no critical initialisation strategy exists for additive noise: signal propagation explodes regardless of the chosen noise distribution.
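The explosion result can be illustrated with the mean-field variance recursion for a ReLU network with additive noise. The recursion below is an assumed simplification of the paper's analysis (the symbols sigma_w2 and noise_var are illustrative): per layer, the noise variance is re-injected, so even at the usual ReLU critical point sigma_w2 = 2 the pre-activation variance grows without bound instead of reaching a fixed point.

```python
def propagate_variance(q0, depth, sigma_w2=2.0, noise_var=0.1):
    """Mean-field pre-activation variance through a noisy ReLU network.

    Assumed recursion (sketch): q_next = sigma_w2 * (q / 2 + noise_var),
    where q/2 is the ReLU second moment and noise_var is the additive
    noise variance re-injected at every layer.
    """
    q = q0
    trace = [q]
    for _ in range(depth):
        q = sigma_w2 * (q / 2.0 + noise_var)
        trace.append(q)
    return trace

# At criticality (sigma_w2 = 2) the variance grows linearly with depth
# instead of staying fixed, matching the "explosion" claim above.
trace = propagate_variance(1.0, 50)
```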

Stochastic Backpropagation and Approximate Inference in Deep Generative Models

We marry ideas from deep neural networks and approximate Bayesian inference to derive a generalised class of deep, directed generative models, endowed with a new algorithm for scalable inference and learning.

Practical Variational Inference for Neural Networks

This paper introduces an easy-to-implement stochastic variational method (or equivalently, minimum description length loss function) that can be applied to most neural networks and revisits several common regularisers from a variational perspective.

Dropout as a Bayesian Approximation: Representing Model Uncertainty in Deep Learning

A new theoretical framework is developed casting dropout training in deep neural networks (NNs) as approximate Bayesian inference in deep Gaussian processes, which mitigates the problem of representing uncertainty in deep learning without sacrificing either computational complexity or test accuracy.
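The dropout-as-approximate-inference recipe amounts to keeping dropout active at prediction time and averaging repeated stochastic forward passes, so the spread of the outputs acts as a rough uncertainty estimate. A toy sketch with fixed random weights (the architecture, dropout rate, and names are arbitrary illustrative choices, not from the paper):

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical two-layer network with fixed random weights.
W1 = rng.standard_normal((1, 16))
W2 = rng.standard_normal((16, 1))

def forward(x, p_drop=0.5):
    h = np.maximum(x @ W1, 0.0)             # ReLU hidden layer
    mask = rng.random(h.shape) > p_drop     # Bernoulli dropout mask, kept ON
    h = h * mask / (1.0 - p_drop)           # inverted-dropout scaling
    return h @ W2

def mc_dropout_predict(x, n_samples=200):
    preds = np.stack([forward(x) for _ in range(n_samples)])
    return preds.mean(axis=0), preds.std(axis=0)

mean, std = mc_dropout_predict(np.array([[1.0]]))
```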

Simple and Scalable Predictive Uncertainty Estimation using Deep Ensembles

This work proposes an alternative to Bayesian NNs that is simple to implement, readily parallelizable, requires very little hyperparameter tuning, and yields high quality predictive uncertainty estimates.
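The ensemble recipe is essentially: train M models independently and read uncertainty off their disagreement. A deliberately trivial sketch in which the "models" are bootstrap linear fits (the real method trains M neural networks from random initialisations; all data and names here are illustrative):

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy data: y = 3x + noise (illustrative assumption).
x = rng.uniform(-1, 1, size=50)
y = 3.0 * x + 0.1 * rng.standard_normal(50)

def fit_member(x, y):
    """One ensemble member: bootstrap resample + least-squares slope."""
    idx = rng.integers(0, len(x), len(x))
    xs, ys = x[idx], y[idx]
    return (xs * ys).sum() / (xs * xs).sum()

slopes = np.array([fit_member(x, y) for _ in range(5)])

def ensemble_predict(x_new):
    """Predictive mean and spread across ensemble members."""
    preds = slopes[:, None] * np.atleast_1d(x_new)[None, :]
    return preds.mean(axis=0), preds.std(axis=0)

mean, std = ensemble_predict(2.0)
```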

Auto-Encoding Variational Bayes

A stochastic variational inference and learning algorithm that scales to large datasets and, under some mild differentiability conditions, even works in the intractable case is introduced.
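The key device in this line of work is the reparameterisation trick: writing z ~ N(mu, sigma^2) as z = mu + sigma * eps with eps ~ N(0, 1) turns sampling into a differentiable map, so Monte Carlo estimates of E[f(z)] can be differentiated with respect to (mu, sigma). A quick check on f(z) = z^2, whose expectation and gradient are known in closed form (the values are illustrative):

```python
import numpy as np

rng = np.random.default_rng(3)

mu, sigma = 1.5, 0.5
eps = rng.standard_normal(100_000)
z = mu + sigma * eps                    # reparameterised samples of N(mu, sigma^2)

# E[z^2] = mu^2 + sigma^2 = 2.5, and dE[z^2]/dmu = 2*mu = 3.0;
# both are recovered by pathwise Monte Carlo estimates.
mc_expectation = (z**2).mean()
mc_grad_mu = (2.0 * z).mean()
```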

Weight Uncertainty in Neural Networks

This work introduces a new, efficient, principled and backpropagation-compatible algorithm for learning a probability distribution on the weights of a neural network, called Bayes by Backprop, and shows how the learnt uncertainty in the weights can be used to improve generalisation in non-linear regression problems.
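Bayes by Backprop optimises variational parameters (mu, rho) by sampling weights and differentiating a per-sample loss of the form log q(w) − log p(w) − log p(D|w); a softplus of rho keeps the posterior standard deviation positive. A sketch evaluating that loss for a toy 1-D regression (the data, sample counts, and names are illustrative assumptions, not the paper's setup):

```python
import numpy as np

rng = np.random.default_rng(5)

# Toy data: roughly y = 2x (illustrative assumption).
x = np.array([1.0, 2.0, 3.0])
y = np.array([2.1, 3.9, 6.2])

def log_normal(v, mean, std):
    """Elementwise log density of N(mean, std^2) at v."""
    return -0.5 * ((v - mean) / std)**2 - np.log(std * np.sqrt(2 * np.pi))

def sample_loss(mu, rho, n=2000, noise_std=0.3):
    std = np.log1p(np.exp(rho))         # softplus keeps std positive
    eps = rng.standard_normal(n)
    w = mu + std * eps                  # sampled weights
    log_q = log_normal(w, mu, std)
    log_prior = log_normal(w, 0.0, 1.0)
    log_lik = log_normal(y[None, :], w[:, None] * x[None, :], noise_std).sum(axis=1)
    # Average of log q - log p - log lik over samples; real use
    # would backpropagate through this estimate.
    return (log_q - log_prior - log_lik).mean()

loss_good = sample_loss(2.0, -3.0)      # posterior centred near the true slope
loss_bad = sample_loss(0.0, -3.0)       # posterior centred far from it
```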

Variational Dropout and the Local Reparameterization Trick

The variational dropout method is proposed: a generalization of Gaussian dropout with a more flexibly parameterized posterior, often leading to better generalization in stochastic gradient variational Bayes.
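The local reparameterisation trick behind this line of work samples a layer's pre-activations directly rather than its weights: for a factorised Gaussian q(W) = N(M, S^2) and input x, the product x @ W is itself Gaussian with mean x @ M and variance x^2 @ S^2, which matches weight sampling in distribution but gives lower-variance gradients. A sketch verifying the distributional claim (shapes and values are illustrative):

```python
import numpy as np

rng = np.random.default_rng(4)

x = rng.standard_normal((8, 3))         # batch of 8 inputs, 3 features
M = rng.standard_normal((3, 2))         # weight means
S = 0.3 * np.ones((3, 2))               # weight standard deviations

def sample_preactivations(x, n):
    """Sample n draws of the layer output b = x @ W without sampling W."""
    mean = x @ M
    std = np.sqrt((x**2) @ (S**2))
    eps = rng.standard_normal((n,) + mean.shape)
    return mean + std * eps

b = sample_preactivations(x, 10_000)
# The empirical mean and std of b should match x @ M and sqrt(x^2 @ S^2).
```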

Deep Information Propagation

The presence of dropout destroys the order-to-chaos critical point and therefore strongly limits the maximum trainable depth for random networks, and a mean field theory for backpropagation is developed that shows that the ordered and chaotic phases correspond to regions of vanishing and exploding gradient respectively.