# Inherent Weight Normalization in Stochastic Neural Networks

@inproceedings{Detorakis2019InherentWN, title={Inherent Weight Normalization in Stochastic Neural Networks}, author={Georgios Detorakis and Sourav Dutta and A. Khanna and Matthew Jerry and Suman Datta and Emre O. Neftci}, booktitle={NeurIPS}, year={2019} }

Multiplicative stochasticity such as Dropout improves the robustness and generalizability of deep neural networks. Here, we further demonstrate that always-on multiplicative stochasticity combined with simple threshold neurons are sufficient operations for deep neural networks. We call such models Neural Sampling Machines (NSM). We find that the probability of activation of the NSM exhibits a self-normalizing property that mirrors Weight Normalization, a previously studied mechanism that…
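The self-normalizing property mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes Bernoulli multiplicative noise on each synapse (keep probability `p`), a threshold at zero, no bias term, and a Gaussian (central-limit) approximation of the pre-activation. The names `activation_probability` and `sample_activation` are hypothetical. Under these assumptions the firing probability depends only on the ratio of the pre-activation mean to its standard deviation, so rescaling all weights by a positive constant leaves it unchanged, which is the sense in which it resembles Weight Normalization.

```python
import math
import random


def activation_probability(weights, inputs, p=0.5):
    """Gaussian approximation of P(u > 0) for u = sum_j xi_j * w_j * x_j,
    where xi_j ~ Bernoulli(p) is assumed multiplicative synaptic noise."""
    terms = [w * x for w, x in zip(weights, inputs)]
    mean = p * sum(terms)                            # E[u]
    var = p * (1.0 - p) * sum(t * t for t in terms)  # Var[u]
    # P(u > 0) = Phi(mean / sqrt(var)), written with the error function.
    return 0.5 * (1.0 + math.erf(mean / math.sqrt(2.0 * var)))


def sample_activation(weights, inputs, p=0.5, rng=random):
    """Draw one binary activation of the stochastic threshold neuron."""
    u = sum(w * x for w, x in zip(weights, inputs) if rng.random() < p)
    return 1 if u > 0 else 0


rng = random.Random(0)
weights = [rng.gauss(0.0, 1.0) for _ in range(200)]
inputs = [rng.random() for _ in range(200)]
scaled = [10.0 * w for w in weights]  # same direction, ten times the norm

p_base = activation_probability(weights, inputs)
p_scaled = activation_probability(scaled, inputs)
print(p_base, p_scaled)  # the two values agree up to floating-point error
```

Averaging many calls to `sample_activation` gives an empirical firing rate that can be compared against `activation_probability`; the match is only approximate because the Gaussian form is itself an approximation.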

## 4 Citations

### Neural sampling machine with stochastic synapse allows brain-like learning and inference

- Computer Science
- Nature Communications
- 2022

This work introduces a novel hardware fabric that can implement a new class of stochastic neural network, the Neural Sampling Machine (NSM), by exploiting the stochasticity in the synaptic connections for approximate Bayesian inference.

### Locally Learned Synaptic Dropout for Complete Bayesian Inference

- Computer Science, Biology
- 2021

This work defines a biologically constrained neural network and sampling scheme based on synaptic failure and lateral inhibition, derives drop-out based epistemic uncertainty, and proves an analytic mapping from synaptic efficacy to release probability that allows networks to sample from arbitrary, learned distributions represented by a receiving layer.

### Autonomous Probabilistic Coprocessing With Petaflips per Second

- Computer Science
- IEEE Access
- 2020

This article explores sequencerless designs where all p-bits are allowed to flip autonomously and demonstrates that such designs can allow ultrafast operation unconstrained by available clock speeds without compromising the solution’s fidelity.

### Supervised Learning in All FeFET-Based Spiking Neural Network: Opportunities and Challenges

- Computer Science
- Frontiers in Neuroscience
- 2020

This work proposes an all-FeFET-based SNN hardware that allows low-power spike-based information processing and co-localized memory and computing (a.k.a. in-memory computing), and implements a surrogate gradient (SG) learning algorithm on the SNN platform that allows supervised learning on the MNIST dataset.

## References


### Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift

- Computer Science
- ICML
- 2015

Applied to a state-of-the-art image classification model, Batch Normalization achieves the same accuracy with 14 times fewer training steps, and beats the original model by a significant margin.

### Self-Normalizing Neural Networks

- Computer Science
- NIPS
- 2017

Self-normalizing neural networks (SNNs) are introduced to enable high-level abstract representations, and it is proved that activations close to zero mean and unit variance that are propagated through many network layers will converge towards zero mean and unit variance, even in the presence of noise and perturbations.

### Estimating or Propagating Gradients Through Stochastic Neurons for Conditional Computation

- Computer Science
- ArXiv
- 2013

This work considers a small-scale version of *conditional computation*, where sparse stochastic units form a distributed representation of gaters that can turn off in combinatorially many ways large chunks of the computation performed in the rest of the neural network.

### Stochastic Synapses Enable Efficient Brain-Inspired Learning Machines

- Computer Science, Biology
- Front. Neurosci.
- 2016

The spiking neuron-based S2Ms outperform existing spike-based unsupervised learners, while potentially offering substantial advantages in terms of power and complexity, and are thus promising models for on-line learning in brain-inspired hardware.

### MuProp: Unbiased Backpropagation for Stochastic Neural Networks

- Computer Science
- ICLR
- 2016

MuProp is presented, an unbiased gradient estimator for stochastic networks, designed to make this task easier by improving on the likelihood-ratio estimator by reducing its variance using a control variate based on the first-order Taylor expansion of a mean-field network.

### Techniques for Learning Binary Stochastic Feedforward Neural Networks

- Computer Science
- ICLR
- 2015

This work confirms that training stochastic networks is difficult and proposes two new estimators that perform favorably among all the five known estimators, and proposes benchmark tests for comparing training algorithms.

### Event-driven contrastive divergence for spiking neuromorphic systems

- Computer Science
- Front. Neurosci.
- 2014

This work presents an event-driven variation of CD to train an RBM constructed with Integrate & Fire neurons, constrained by the limitations of existing and near-future neuromorphic hardware platforms, and contributes to a machine learning-driven approach for synthesizing networks of spiking neurons capable of carrying out practical, high-level functionality.

### Event-Driven Random Back-Propagation: Enabling Neuromorphic Deep Learning Machines

- Computer Science
- Front. Neurosci.
- 2017

An event-driven random backpropagation (eRBP) rule is demonstrated that uses an error-modulated synaptic plasticity rule for learning deep representations in neuromorphic computing hardware, achieving nearly identical classification accuracies compared to artificial neural network simulations on GPUs, while being robust to neural and synaptic state quantizations during learning.

### Normalizing the Normalizers: Comparing and Extending Network Normalization Schemes

- Computer Science
- ICLR
- 2017

A unified view of normalization techniques as forms of divisive normalization is proposed, which includes layer and batch normalization as special cases, and it is found that a small modification to these normalization schemes, in conjunction with a sparse regularizer on the activations, leads to significant benefits over standard normalization techniques.

### Learning Discrete Weights Using the Local Reparameterization Trick

- Computer Science
- ICLR
- 2018

This work introduces LR-nets (Local reparameterization networks), a new method for training neural networks with discrete weights using stochastic parameters, and shows how a simple modification to the local reparameterization trick enables the training of discrete weights.