• Corpus ID: 204904150

# Inherent Weight Normalization in Stochastic Neural Networks

@inproceedings{Detorakis2019InherentWN,
  title     = {Inherent Weight Normalization in Stochastic Neural Networks},
  author    = {Georgios Detorakis and Sourav Dutta and A. Khanna and Matthew Jerry and Suman Datta and Emre O. Neftci},
  booktitle = {NeurIPS},
  year      = {2019}
}
• Published in NeurIPS 27 October 2019
• Computer Science
Multiplicative stochasticity such as Dropout improves the robustness and generalizability of deep neural networks. Here, we further demonstrate that always-on multiplicative stochasticity combined with simple threshold neurons is sufficient for building deep neural networks. We call such models Neural Sampling Machines (NSM). We find that the probability of activation of the NSM exhibits a self-normalizing property that mirrors Weight Normalization, a previously studied mechanism that…
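The two ingredients named in the abstract — always-on multiplicative noise on the connections plus simple threshold neurons — can be sketched in a few lines of NumPy. This is a minimal illustration, not the paper's exact formulation: the function name `nsm_layer` and the Bernoulli "blank-out" noise with keep probability `p=0.5` are assumptions for the sake of the example.

```python
import numpy as np

rng = np.random.default_rng(0)

def nsm_layer(x, W, p=0.5):
    """Hypothetical NSM-style layer: always-on multiplicative Bernoulli
    noise on every connection, followed by a hard threshold neuron."""
    # Multiplicative ("blank-out") noise: each synapse is kept with prob. p
    mask = (rng.random(W.shape) < p).astype(float)
    pre = (W * mask) @ x
    # Simple threshold neuron: emits 1 when the noisy pre-activation is positive
    return (pre > 0).astype(float)

x = np.ones(4)
W = rng.standard_normal((3, 4))
out = nsm_layer(x, W)  # binary vector of length 3; stochastic across calls
```

Because the mask is redrawn on every forward pass, repeated calls yield samples of the activation, whose per-unit firing probability is the self-normalizing quantity the paper analyzes.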
## 4 Citations

### Neural sampling machine with stochastic synapse allows brain-like learning and inference

• Computer Science
Nature communications
• 2022
This work introduces a novel hardware fabric that can implement a new class of stochastic neural network called Neural Sampling Machine (NSM) by exploiting the stochasticity in the synaptic connections for approximate Bayesian inference.

### Locally Learned Synaptic Dropout for Complete Bayesian Inference

• Computer Science, Biology
• 2021
This work defines a biologically constrained neural network and sampling scheme based on synaptic failure and lateral inhibition, derives drop-out based epistemic uncertainty, and proves an analytic mapping from synaptic efficacy to release probability that allows networks to sample from arbitrary, learned distributions represented by a receiving layer.

### Autonomous Probabilistic Coprocessing With Petaflips per Second

• Computer Science
IEEE Access
• 2020
This article explores sequencerless designs where all p-bits are allowed to flip autonomously and demonstrates that such designs can allow ultrafast operation unconstrained by available clock speeds without compromising the solution’s fidelity.

### Supervised Learning in All FeFET-Based Spiking Neural Network: Opportunities and Challenges

• Computer Science
Frontiers in Neuroscience
• 2020
This work proposes an all FeFET-based SNN hardware that allows low-power spike-based information processing and co-localized memory and computing (a.k.a. in-memory computing), and implements a surrogate gradient (SG) learning algorithm on the SNN platform that allows supervised learning on the MNIST dataset.

## References

Showing 1–10 of 55 references

### Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift

• Computer Science
ICML
• 2015
Applied to a state-of-the-art image classification model, Batch Normalization achieves the same accuracy with 14 times fewer training steps, and beats the original model by a significant margin.
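The core operation of Batch Normalization — standardizing each feature over the mini-batch, then applying a learned scale and shift — is compact enough to sketch directly (inference-time running statistics and the learned `gamma`/`beta` parameters are simplified to scalars here):

```python
import numpy as np

def batch_norm(x, gamma=1.0, beta=0.0, eps=1e-5):
    """Batch Normalization over a mini-batch x of shape (batch, features):
    standardize per feature, then apply scale gamma and shift beta."""
    mean = x.mean(axis=0)
    var = x.var(axis=0)
    x_hat = (x - mean) / np.sqrt(var + eps)
    return gamma * x_hat + beta

batch = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
y = batch_norm(batch)
# each column of y now has (approximately) zero mean and unit variance
```

The reduction of internal covariate shift claimed in the paper follows from exactly this per-feature standardization being applied at every layer, every step.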

### Self-Normalizing Neural Networks

• Computer Science
NIPS
• 2017
Self-normalizing neural networks (SNNs) are introduced to enable high-level abstract representations, and it is proved that activations close to zero mean and unit variance that are propagated through many network layers will converge towards zero mean and unit variance -- even in the presence of noise and perturbations.
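The mechanism behind this result is the SELU activation with two fixed constants derived in the paper. A short NumPy sketch of the activation, plus an (illustrative, not from the paper) depth-propagation check with variance-1/n weight initialization:

```python
import numpy as np

# Fixed SELU constants from the Self-Normalizing Neural Networks paper
ALPHA = 1.6732632423543772
SCALE = 1.0507009873554805

def selu(x):
    """Scaled exponential linear unit."""
    x = np.asarray(x, dtype=float)
    return SCALE * np.where(x > 0, x, ALPHA * (np.exp(x) - 1))

# Illustrative check: push standardized activations through many layers
rng = np.random.default_rng(0)
h = rng.standard_normal(1000)
n = h.size
for _ in range(50):
    W = rng.standard_normal((n, n)) / np.sqrt(n)  # variance-1/n init
    h = selu(W @ h)
# h should remain close to zero mean and unit variance at depth 50
```

The fixed point (zero mean, unit variance) is attracting under these constants and this initialization, which is the convergence property the abstract states.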

### Estimating or Propagating Gradients Through Stochastic Neurons for Conditional Computation

• Computer Science
arXiv
• 2013
This work considers a small-scale version of *conditional computation*, where sparse stochastic units form a distributed representation of gaters that can turn off, in combinatorially many ways, large chunks of the computation performed in the rest of the neural network.

### Stochastic Synapses Enable Efficient Brain-Inspired Learning Machines

• Computer Science, Biology
Front. Neurosci.
• 2016
The spiking neuron-based S2Ms outperform existing spike-based unsupervised learners, while potentially offering substantial advantages in terms of power and complexity, and are thus promising models for on-line learning in brain-inspired hardware.

### MuProp: Unbiased Backpropagation for Stochastic Neural Networks

• Computer Science
ICLR
• 2016
MuProp is presented, an unbiased gradient estimator for stochastic networks that improves on the likelihood-ratio estimator by reducing its variance with a control variate based on the first-order Taylor expansion of a mean-field network.
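The variance-reduction idea is easy to demonstrate on a single Bernoulli unit. The sketch below uses a plain scalar baseline `f(p)` evaluated at the mean-field value — a simplification loosely in the spirit of MuProp's Taylor-expansion control variate, not the paper's full estimator:

```python
import numpy as np

rng = np.random.default_rng(1)

def lr_grad_samples(p, f, baseline=0.0, n=200_000):
    """Likelihood-ratio (REINFORCE) samples of d/dp E[f(b)], b ~ Bernoulli(p),
    optionally centred by a scalar control variate (baseline)."""
    b = (rng.random(n) < p).astype(float)
    score = b / p - (1.0 - b) / (1.0 - p)  # d/dp log P(b; p)
    return (f(b) - baseline) * score

p = 0.5
f = lambda b: (b - 0.4) ** 2
plain = lr_grad_samples(p, f)
# Mean-field baseline f(p); subtracting it leaves the estimator unbiased
# because the score function has zero expectation
centred = lr_grad_samples(p, f, baseline=f(p))
# Both sample means approach the true gradient f(1) - f(0) = 0.2,
# but the centred estimator has lower variance.
```

Picking the baseline from a Taylor expansion of a mean-field pass, as MuProp does, is what makes the control variate cheap to compute for whole networks.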

### Techniques for Learning Binary Stochastic Feedforward Neural Networks

• Computer Science
ICLR
• 2015
This work confirms that training stochastic networks is difficult and proposes two new estimators that perform favorably among all the five known estimators, and proposes benchmark tests for comparing training algorithms.

### Event-driven contrastive divergence for spiking neuromorphic systems

• Computer Science
Front. Neurosci.
• 2014
This work presents an event-driven variation of CD to train a RBM constructed with Integrate & Fire neurons, that is constrained by the limitations of existing and near future neuromorphic hardware platforms, and contributes to a machine learning-driven approach for synthesizing networks of spiking neurons capable of carrying out practical, high-level functionality.

### Event-Driven Random Back-Propagation: Enabling Neuromorphic Deep Learning Machines

• Computer Science
Front. Neurosci.
• 2017
An event-driven random backpropagation (eRBP) rule is demonstrated that uses an error-modulated synaptic plasticity rule for learning deep representations in neuromorphic computing hardware, achieving nearly identical classification accuracies compared to artificial neural network simulations on GPUs, while being robust to neural and synaptic state quantizations during learning.

### Normalizing the Normalizers: Comparing and Extending Network Normalization Schemes

• Computer Science
ICLR
• 2017
A unified view of normalization techniques, as forms of divisive normalization, is proposed, which includes layer and batch normalization as special cases; a small modification to these normalization schemes, in conjunction with a sparse regularizer on the activations, is found to yield significant benefits over standard normalization techniques.
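To make the "special case" claim concrete: a generic divisive normalization subtracts a term and divides by a function of the activations; choosing the mean as the subtractive term and the root-mean-square of the centred activations as the divisor recovers layer normalization. A minimal sketch (the function name and the `sigma` smoothing constant are illustrative):

```python
import numpy as np

def divisive_norm(x, sigma=1e-5):
    """Divisive normalization along the feature axis. With the mean as the
    subtractive term and the RMS of the centred activations as the divisor,
    this reduces to layer normalization (one of the special cases above)."""
    z = x - x.mean(axis=-1, keepdims=True)
    return z / np.sqrt(sigma + (z ** 2).mean(axis=-1, keepdims=True))

a = np.array([[1.0, 2.0, 3.0, 4.0]])
y = divisive_norm(a)
# each row of y has zero mean and (approximately) unit second moment
```

Batch normalization fits the same template with the statistics taken over the batch axis instead of the feature axis.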

### Learning Discrete Weights Using the Local Reparameterization Trick

• Computer Science
ICLR
• 2018
This work introduces LR-nets (local reparameterization networks), a new method for training neural networks with discrete weights using stochastic parameters, and shows how a simple modification to the local reparameterization trick enables the training of discrete weights.