Imposing Gaussian Pre-Activations in a Neural Network

@article{Wolinski2022ImposingGP,
  title={Imposing Gaussian Pre-Activations in a Neural Network},
  author={Pierre Wolinski and Julyan Arbel},
  journal={ArXiv},
  year={2022},
  volume={abs/2205.12379}
}
The goal of the present work is to propose a way to modify both the initialization distribution of the weights of a neural network and its activation function, such that all pre-activations are Gaussian. We propose a family of initialization/activation pairs, where the activation functions span a continuum from bounded functions (such as Heaviside or tanh) to the identity function. This work is motivated by the contradiction between existing works dealing with Gaussian pre-activations: on one…
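
As a rough numerical illustration of the problem being addressed (this is not the paper's proposed construction), the following sketch tracks how far pre-activations drift from Gaussian with depth in a plain tanh network with i.i.d. Gaussian weights; the width, depth and pooled-kurtosis diagnostic are arbitrary choices.

# Crude diagnostic (not the paper's method): excess kurtosis of pre-activations
# across depth for a tanh network with i.i.d. Gaussian weights of variance 1/width.
import numpy as np

rng = np.random.default_rng(0)
width, depth, n_samples = 256, 20, 2000

h = rng.standard_normal((n_samples, width))          # Gaussian inputs
for layer in range(depth):
    W = rng.standard_normal((width, width)) / np.sqrt(width)
    z = h @ W.T                                      # pre-activations of this layer
    k = ((z - z.mean()) ** 4).mean() / z.var() ** 2 - 3.0   # 0 for an exact Gaussian
    print(f"layer {layer + 1:2d}: excess kurtosis of pre-activations = {k:+.3f}")
    h = np.tanh(z)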

References

Bayesian neural network unit priors and generalized Weibull-tail property

TLDR
The main result is an accurate description of hidden-unit tails, showing that unit priors become heavier-tailed with depth, thanks to the introduced notion of generalized Weibull-tail distributions.
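
For context, a hedged statement of the tail notion involved (the constant K below is generic): a random variable X is sub-Weibull with tail parameter θ > 0 when its tail obeys the bound below; larger θ means heavier tails, and the "heavier-tailed with depth" claim corresponds to this tail parameter growing with the layer index.

\[
  \mathbb{P}\bigl(|X| \ge x\bigr) \;\le\; \exp\!\bigl(-(x/K)^{1/\theta}\bigr),
  \qquad \text{for all } x \ge 0 .
\]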

Bayesian Neural Network Priors Revisited

TLDR
It is found that fully connected networks (FCNNs) display heavy-tailed weight distributions, while convolutional neural network (CNN) weights display strong spatial correlations, and that building these observations into the respective priors leads to improved performance on a variety of image classification datasets.
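
A purely illustrative sketch of "building heavier tails into the prior" (the Student-t family, the degrees of freedom and the layer sizes below are assumptions, not the paper's settings): sample fully connected weights from a heavy-tailed distribution rescaled to the usual 1/fan-in variance.

# Illustration only: heavier-tailed Student-t weight prior with variance 1/fan_in.
import numpy as np

rng = np.random.default_rng(0)

def sample_weights(fan_in, fan_out, df=5.0):
    t = rng.standard_t(df, size=(fan_out, fan_in))
    # standard t has variance df/(df-2); rescale so entries have variance 1/fan_in
    return t * np.sqrt((df - 2.0) / df) / np.sqrt(fan_in)

W1 = sample_weights(784, 256)
W2 = sample_weights(256, 10)
print(W1.std(), np.abs(W1).max())   # Gaussian-like scale, but heavier extremes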

Least squares binary quantization of neural networks

  • H. Pouransari, Oncel Tuzel
  • Computer Science
    2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)
  • 2020
TLDR
This work introduces a novel 2-bit quantization with provably least-squares error, and provides a unified framework for analyzing different scaling strategies for binary quantization, in which values are mapped to -1 and 1.
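
A minimal sketch, assuming the standard closed-form least-squares scale for the 1-bit case (the scale minimizing ||w − a·sign(w)||² is a = mean|w|); the 2-bit variant below is a greedy residual pass and only approximates the paper's provably optimal scheme.

# Greedy sketch of binary / 2-bit quantization with least-squares scaling.
import numpy as np

def binarize(w):
    s = np.where(w >= 0, 1.0, -1.0)        # values mapped to -1 and +1
    return np.abs(w).mean() * s            # closed-form least-squares scale

def quantize_2bit_greedy(w):
    q1 = binarize(w)
    q2 = binarize(w - q1)                  # binarize the residual
    return q1 + q2

rng = np.random.default_rng(0)
w = rng.standard_normal(10_000)
for q in (binarize(w), quantize_2bit_greedy(w)):
    print("quantization MSE:", np.mean((w - q) ** 2))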

Implicit Regularization in Deep Matrix Factorization

TLDR
This work studies the implicit regularization of gradient descent over deep linear neural networks for matrix completion and sensing, a model referred to as deep matrix factorization, and finds that adding depth to a matrix factorization enhances an implicit tendency towards low-rank solutions.
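
A small numerical sketch of the setting (matrix size, observation rate, initialization scale, learning rate and step count are arbitrary assumptions): gradient descent on a depth-3 factorization of a partially observed low-rank matrix, after which the spectrum of the recovered product tends to concentrate on a few directions.

# Sketch: deep matrix factorization for matrix completion by plain gradient descent.
import numpy as np

rng = np.random.default_rng(0)
n, true_rank = 30, 2
M = rng.standard_normal((n, true_rank)) @ rng.standard_normal((true_rank, n))
M /= np.linalg.norm(M, 2)                        # normalize the spectral norm to 1
mask = rng.random((n, n)) < 0.5                  # roughly half of the entries observed

W1, W2, W3 = (0.02 * rng.standard_normal((n, n)) for _ in range(3))
lr = 0.1
for _ in range(10_000):
    E = mask * (W3 @ W2 @ W1 - M)                # gradient of the loss w.r.t. the product
    g1, g2, g3 = (W3 @ W2).T @ E, W3.T @ E @ W1.T, E @ (W2 @ W1).T
    W1, W2, W3 = W1 - lr * g1, W2 - lr * g2, W3 - lr * g3

sv = np.linalg.svd(W3 @ W2 @ W1, compute_uv=False)
print("top singular values:", np.round(sv[:5], 3))  # tends to concentrate on ~2 directions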

Sub‐Weibull distributions: Generalizing sub‐Gaussian and sub‐Exponential properties to heavier tailed distributions

We propose the notion of sub‐Weibull distributions, which are characterized by tails lighter than (or as light as) the right tail of a Weibull distribution. This novel class generalizes the sub‐Gaussian and sub‐Exponential families to potentially heavier-tailed distributions.
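
For reference, one standard characterization (equivalent up to constants to the tail bound) of a sub-Weibull random variable X with tail parameter θ > 0 is through moment growth; θ = 1/2 recovers the sub-Gaussian case and θ = 1 the sub-Exponential case.

\[
  \|X\|_k \;=\; \bigl(\mathbb{E}\,|X|^k\bigr)^{1/k} \;\le\; C\, k^{\theta}
  \qquad \text{for all } k \ge 1, \text{ for some constant } C > 0 .
\]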

Neural tangent kernel: convergence and generalization in neural networks (invited paper)

TLDR
This talk introduces the Neural Tangent Kernel formalism, gives a number of results on it, and explains how they provide insight into the dynamics of neural networks during training and into their generalization features.
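
A sketch of the empirical neural tangent kernel for a one-hidden-layer network f(x) = vᵀ tanh(Wx)/√m, computed directly from the explicit parameter gradients; the width, input dimension and test inputs are arbitrary assumptions.

# Empirical NTK of f(x) = v^T tanh(W x) / sqrt(m): sum of gradient inner products.
import numpy as np

rng = np.random.default_rng(0)
d, m = 5, 10_000                                 # input dimension, hidden width
W = rng.standard_normal((m, d))
v = rng.standard_normal(m)

def ntk(x, xp):
    hx, hxp = W @ x, W @ xp
    # output-layer gradient term: phi(Wx) . phi(Wx') / m
    k_v = np.tanh(hx) @ np.tanh(hxp) / m
    # hidden-layer gradient term: (x . x') * sum_i v_i^2 phi'(w_i.x) phi'(w_i.x') / m
    k_w = (x @ xp) * np.sum(v**2 * (1 - np.tanh(hx)**2) * (1 - np.tanh(hxp)**2)) / m
    return k_v + k_w

x1, x2 = rng.standard_normal(d), rng.standard_normal(d)
print(ntk(x1, x1), ntk(x1, x2))                  # at large width these values concentrate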

Gaussian Process Behaviour in Wide Deep Neural Networks

TLDR
It is shown that, under broad conditions, as the architecture is made increasingly wide, the implied random function converges in distribution to a Gaussian process, formalising and extending existing results by Neal (1996) to deep networks.
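
A Monte Carlo sketch of this limit (width, sample count and the tanh nonlinearity are assumptions): fix an input, sample many independent random two-layer networks, and check that the scalar output has approximately Gaussian skewness and kurtosis.

# Wide-network Gaussian behaviour: distribution of the output over random networks.
import numpy as np

rng = np.random.default_rng(0)
d, width, n_nets = 10, 1000, 3000
x = rng.standard_normal(d)                       # one fixed input

outs = np.empty(n_nets)
for i in range(n_nets):
    W = rng.standard_normal((width, d)) / np.sqrt(d)      # 1/fan_in weight scaling
    v = rng.standard_normal(width) / np.sqrt(width)
    outs[i] = v @ np.tanh(W @ x)

z = (outs - outs.mean()) / outs.std()
print("skewness:", np.mean(z**3), "excess kurtosis:", np.mean(z**4) - 3)  # both ~0 if Gaussian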

Exponential expressivity in deep neural networks through transient chaos

TLDR
The theoretical analysis of the expressive power of deep networks broadly applies to arbitrary nonlinearities, and provides a quantitative underpinning for previously abstract notions about the geometry of deep functions.
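
A sketch of the mean-field "length map" used in this style of analysis, q_l = σ_w² E_z[φ(√q_{l−1} z)²] + σ_b² with z ~ N(0, 1) and φ = tanh, estimated here by Monte Carlo; the (σ_w, σ_b) values are example choices on either side of the order-to-chaos transition, not values taken from the paper.

# Iterate the mean-field length map for tanh with a Monte Carlo Gaussian expectation.
import numpy as np

rng = np.random.default_rng(0)
z = rng.standard_normal(200_000)                 # samples for the Gaussian expectation

def length_map(q, sigma_w, sigma_b):
    return sigma_w**2 * np.mean(np.tanh(np.sqrt(q) * z)**2) + sigma_b**2

for sigma_w in (0.8, 2.5):                       # ordered vs chaotic regime (example values)
    q = 1.0
    for _ in range(30):
        q = length_map(q, sigma_w, sigma_b=0.1)
    print(f"sigma_w = {sigma_w}: fixed point q* ≈ {q:.3f}")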

The perceptron: a probabilistic model for information storage and organization in the brain.

TLDR
This article will be concerned primarily with the second and third questions, which are still subject to a vast amount of speculation, and where the few relevant facts currently supplied by neurophysiology have not yet been integrated into an acceptable theory.