Training Dynamical Binary Neural Networks with Equilibrium Propagation

Authors: Jérémie Laydevant, Maxence Ernoult, Damien Querlioz, Julie Grollier
Venue: 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)
Equilibrium Propagation (EP) is an algorithm intrinsically adapted to the training of physical networks, thanks to the local updates of weights given by the internal dynamics of the system. However, building such hardware requires making the algorithm compatible with existing neuromorphic CMOS technologies, which generally exploit digital communication between neurons and offer a limited amount of local memory. In this work, we demonstrate that EP can train dynamical networks with…
A comprehensive review of Binary Neural Network
This paper focuses exclusively on networks with 1-bit activations and weights, unlike previous surveys that mix in low-bit works, and discusses potential directions and future research opportunities for the latest BNN algorithms and techniques.
Spike time displacement based error backpropagation in convolutional spiking neural networks
Evaluation on image classification with two popular benchmarks, MNIST and FashionMNIST, yields accuracies of 99.2% and 92.8%, respectively, confirming that this algorithm is applicable to deep SNNs.


Scaling Equilibrium Propagation to Deep ConvNets by Drastically Reducing Its Gradient Estimator Bias
This work shows that a bias in the gradient estimate of equilibrium propagation is responsible for this phenomenon and that canceling it allows training deep convolutional neural networks; it also generalizes Equilibrium Propagation to the case of cross-entropy loss (as opposed to squared error).
Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift
Applied to a state-of-the-art image classification model, Batch Normalization achieves the same accuracy with 14 times fewer training steps, and beats the original model by a significant margin.
Assessing the Scalability of Biologically-Motivated Deep Learning Algorithms and Architectures
Results are presented on scaling up biologically motivated models of deep learning on datasets that need deep networks with appropriate architectures to achieve good performance, and implementation details help establish baselines for biologically motivated deep learning schemes going forward.
Equivalent-accuracy accelerated neural-network training using analogue memory
Mixed hardware–software neural-network implementations that involve up to 204,900 synapses and that combine long-term storage in phase-change memory, near-linear updates of volatile capacitors and weight-data transfer with ‘polarity inversion’ to cancel out inherent device-to-device variations are demonstrated.
Equilibrium Propagation with Continual Weight Updates
It is proved theoretically that, provided the learning rates are sufficiently small, at each time step of the second phase the dynamics of neurons and synapses follow the gradients of the loss given by BPTT.
Equilibrium Propagation for Memristor-Based Recurrent Neural Networks
Experimental results show that both approaches significantly outperform conventional architectures used for pattern reconstruction. Because the equilibrium propagation learning rule is highly suitable for VLSI implementation, additional results on classification of the MNIST dataset are also reported.
Training End-to-End Analog Neural Networks with Equilibrium Propagation
It is shown mathematically that a class of analog neural networks (called nonlinear resistive networks) are energy-based models: they possess an energy function as a consequence of Kirchhoff's laws governing electrical circuits.
Backpropagation and the brain
It is argued that the key principles underlying backprop may indeed have a role in brain function and induce neural activities whose differences can be used to locally approximate these signals and hence drive effective learning in deep networks in the brain.
Binarized Neural Networks
A binary matrix multiplication GPU kernel is written with which it is possible to run the MNIST BNN 7 times faster than with an unoptimized GPU kernel, without suffering any loss in classification accuracy.
Equivalence of Equilibrium Propagation and Recurrent Backpropagation
This work shows that it is not required to have a side network for the computation of error derivatives and supports the hypothesis that in biological neural networks, temporal derivatives of neural activities may code for error signals.