Corpus ID: 1518846

BinaryConnect: Training Deep Neural Networks with binary weights during propagations

@inproceedings{Courbariaux2015BinaryConnectTD,
  title={BinaryConnect: Training Deep Neural Networks with binary weights during propagations},
  author={Matthieu Courbariaux and Yoshua Bengio and Jean-Pierre David},
  booktitle={NIPS},
  year={2015}
}
Deep Neural Networks (DNN) have achieved state-of-the-art results in a wide range of tasks, with the best results obtained with large training sets and large models. [...] Key Method We introduce BinaryConnect, a method which consists in training a DNN with binary weights during the forward and backward propagations, while retaining the precision of the stored weights in which gradients are accumulated. Like other dropout schemes, we show that BinaryConnect acts as a regularizer and we obtain near state-of-the-art…
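A minimal NumPy sketch of the scheme the abstract describes: binarize the stored real-valued weights for both the forward and backward passes, but accumulate the gradient updates in the real-valued weights. The toy layer, loss, and names (binarize, W_real) are illustrative choices, not the authors' code.

import numpy as np

def binarize(w):
    # Deterministic binarization: map each real-valued weight to +1 or -1.
    return np.where(w >= 0, 1.0, -1.0)

rng = np.random.default_rng(0)
W_real = 0.01 * rng.standard_normal((784, 10))   # full-precision weights that accumulate updates
x = rng.standard_normal((32, 784))               # a mini-batch of inputs
y = rng.integers(0, 10, size=32)                 # integer class labels
lr = 0.1

for step in range(10):
    W_bin = binarize(W_real)                     # binary weights used in both propagations
    logits = x @ W_bin                           # forward pass with binary weights
    probs = np.exp(logits - logits.max(axis=1, keepdims=True))
    probs /= probs.sum(axis=1, keepdims=True)
    grad_logits = probs.copy()
    grad_logits[np.arange(32), y] -= 1.0         # softmax cross-entropy gradient
    grad_W = x.T @ grad_logits / 32              # backward pass also sees only binary weights
    W_real -= lr * grad_W                        # but the update is accumulated in full precision
    W_real = np.clip(W_real, -1.0, 1.0)          # keep real weights in the range binarization covers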
Training and Inference with Integers in Deep Neural Networks
TLDR
Empirically, this work demonstrates the potential to deploy training in hardware systems such as integer-based deep learning accelerators and neuromorphic chips with comparable accuracy and higher energy efficiency, which is crucial for future AI applications in varied scenarios with transfer and continual learning demands.
PXNOR: Perturbative Binary Neural Network
  • Vlad Pelin, I. Radoi
  • Computer Science
  • 2019 18th RoEduNet Conference: Networking in Education and Research (RoEduNet)
  • 2019
TLDR
PXNOR seeks to fully replace traditional convolutional filters with approximate operations, while replacing all multiplications and additions with simpler, much faster versions such as XNOR and bitcounting, which are implemented at the hardware level on all existing platforms.
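A small illustration (not PXNOR's actual kernels) of why XNOR plus bitcounting can replace multiply-accumulate once values are constrained to ±1: with signs packed one bit per element, XNOR marks where the signs agree and a popcount recovers the dot product.

def xnor_popcount_dot(a_bits, b_bits, n):
    # Dot product of two length-n {-1,+1} vectors packed one bit per element
    # (bit i = 1 encodes +1). Signs agree exactly where XNOR yields 1, so
    # dot = (#agreements) - (#disagreements) = 2*popcount(XNOR) - n.
    mask = (1 << n) - 1
    agree = ~(a_bits ^ b_bits) & mask
    return 2 * bin(agree).count("1") - n

# a = [+1, -1, +1, +1] -> bits 1,0,1,1 -> 0b1101;  b = [+1, +1, -1, +1] -> 0b1011
print(xnor_popcount_dot(0b1101, 0b1011, 4))  # prints 0, matching the real-valued dot product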
Binarized Neural Networks: Training Deep Neural Networks with Weights and Activations Constrained to +1 or -1
TLDR
A binary matrix multiplication GPU kernel is written with which it is possible to run the MNIST BNN 7 times faster than with an unoptimized GPU kernel, without suffering any loss in classification accuracy.
GXNOR-Net: Training deep neural networks with ternary weights and activations without full-precision memory under a unified discretization framework
TLDR
It is found that when both the weights and activations become ternary values, the DNNs can be reduced to sparse binary networks, termed gated XNOR networks (GXNOR-Nets), which promise event-driven hardware design for efficient mobile intelligence.
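A toy sketch of the 'gated' computation this summary points to, assuming ternary operands in {-1, 0, +1}: terms where either operand is zero are skipped entirely, and the surviving terms reduce to a sign comparison. It illustrates the idea only, not the GXNOR-Net implementation or its discretization framework.

import numpy as np

def gated_ternary_dot(w, x):
    # A term contributes only when both operands are nonzero (the 'gate');
    # among active terms the product is +1 when signs agree and -1 otherwise,
    # so no real multiplications are needed.
    active = (w != 0) & (x != 0)
    agree = (w[active] == x[active])
    return int(agree.sum()) - int((~agree).sum())

w = np.array([+1, 0, -1, +1, 0])
x = np.array([-1, +1, -1, +1, +1])
print(gated_ternary_dot(w, x), int(w @ x))   # both print 1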
BINARY DEEP NEURAL NETWORKS
There are many application scenarios for which the computational performance and memory footprint of the prediction phase of Deep Neural Networks (DNNs) need to be optimized. Binary Deep Neural…
Training High-Performance and Large-Scale Deep Neural Networks with Full 8-bit Integers
TLDR
A unified complete quantization framework termed WAGEUBN is proposed to quantize DNNs involving all data paths, including W (Weights), A (Activation), G (Gradient), E (Error), U (Update), and BN (Batch Normalization); the Momentum optimizer is also quantized to realize a completely quantized framework.
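For context, a generic symmetric 8-bit quantizer of the kind such fully integer frameworks build on; WAGEUBN's actual per-data-path quantization rules are not reproduced here, and the function names are illustrative.

import numpy as np

def quantize_int8(t):
    # Symmetric uniform quantization: map the largest magnitude to 127 and
    # represent the tensor as int8 values plus one floating-point scale.
    scale = np.max(np.abs(t)) / 127.0 + 1e-12
    q = np.clip(np.round(t / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    return q.astype(np.float32) * scale

w = np.random.default_rng(0).standard_normal((4, 4)).astype(np.float32)
q, s = quantize_int8(w)
print(np.max(np.abs(w - dequantize(q, s))))   # quantization error, at most about scale/2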
Espresso: Efficient Forward Propagation for Binary Deep Neural Networks
TLDR
Espresso provides special convolutional and dense layers for BCNNs, leveraging bit-packing and bitwise computations for efficient execution; it speeds up matrix-multiplication routines and, at the same time, reduces memory usage when storing parameters and activations.
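A rough sketch of the bit-packing step such layers rely on, using NumPy's packbits; Espresso itself packs ±1 values into machine words and pairs this with bitwise XNOR/popcount kernels, which this toy round-trip does not show.

import numpy as np

def pack_signs(v):
    # Pack a {-1,+1} vector into bytes, one bit per element (1 encodes +1),
    # so 64 or more values fit in a single machine word.
    bits = (v > 0).astype(np.uint8)
    return np.packbits(bits)

def unpack_signs(packed, n):
    bits = np.unpackbits(packed)[:n]
    return np.where(bits == 1, 1, -1).astype(np.int8)

v = np.array([+1, -1, -1, +1, +1, +1, -1, +1, -1], dtype=np.int8)
packed = pack_signs(v)                  # 9 elements stored in 2 bytes
print(unpack_signs(packed, len(v)))     # round-trips to the original signs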
BinaryNet: Training Deep Neural Networks with Weights and Activations Constrained to +1 or -1
TLDR
BinaryNet, a method which trains DNNs with binary weights and activations when computing the parameters' gradients, is introduced; it drastically reduces memory usage and replaces most multiplications with 1-bit exclusive-NOR (XNOR) operations, which might have a big impact on both general-purpose and dedicated Deep Learning hardware.
Simultaneously Optimizing Weight and Quantizer of Ternary Neural Network Using Truncated Gaussian Approximation
  • Zhezhi He, Deliang Fan
  • Computer Science, Mathematics
  • 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
  • 2019
TLDR
This work is the first to incorporate the thresholds of weight ternarization into a closed-form representation using truncated Gaussian approximation, enabling simultaneous optimization of weights and quantizer through back-propagation training.
Sparsely-Connected Neural Networks: Towards Efficient VLSI Implementation of Deep Neural Networks
TLDR
Sparsely-connected neural networks are proposed, showing that the number of connections in fully-connected networks can be reduced by up to 90% while improving accuracy on three popular datasets; an efficient hardware architecture based on linear-feedback shift registers is also proposed to reduce the memory requirements of the sparsely-connected networks.
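An illustrative sketch of how a linear-feedback shift register can generate a reproducible connection mask that never needs to be stored, which is the memory-saving idea mentioned above; the LFSR polynomial and the mapping from bits to connections here are assumptions, not the paper's design.

def lfsr16_bits(seed, n):
    # 16-bit Fibonacci LFSR with taps at bits 16, 14, 13, 11 (a maximal-length
    # polynomial): a tiny register that emits a reproducible pseudo-random bit stream.
    state = seed & 0xFFFF
    out = []
    for _ in range(n):
        bit = (state ^ (state >> 2) ^ (state >> 3) ^ (state >> 5)) & 1
        state = (state >> 1) | (bit << 15)
        out.append(state & 1)
    return out

# Use the bit stream as a connection mask for a small fully-connected layer:
# weights whose mask bit is 0 are pruned. The mask is regenerated from the
# 16-bit seed whenever it is needed, so it costs no weight memory.
rows, cols = 4, 8
stream = lfsr16_bits(0xACE1, rows * cols)
mask = [stream[r * cols:(r + 1) * cols] for r in range(rows)]
for row in mask:
    print(row)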

References

SHOWING 1-10 OF 53 REFERENCES
Training deep neural networks with low precision multiplications
TLDR
It is found that very low precision is sufficient not just for running trained networks but also for training them, and it is possible to train Maxout networks with 10-bit multiplications.
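A simple software emulation of what 'low precision' means here: round values onto a signed fixed-point grid before using them, so the effect of a narrower multiplier can be measured without special hardware. The bit-widths below are illustrative, not the formats studied in the paper.

import numpy as np

def to_fixed(x, int_bits, frac_bits):
    # Emulate a signed fixed-point format with int_bits integer bits and
    # frac_bits fractional bits: scale, round, clip, then scale back to float.
    scale = 2.0 ** frac_bits
    lo = -(2.0 ** int_bits)
    hi = 2.0 ** int_bits - 1.0 / scale
    return np.clip(np.round(x * scale) / scale, lo, hi)

a = np.array([0.337, -1.62, 2.9])
print(to_fixed(a, 2, 4))   # [ 0.3125 -1.625   2.875 ] : values snapped to a 1/16 grid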
Low precision arithmetic for deep learning
TLDR
It is found that very low precision computation is sufficient not just for running trained networks but also for training them.
ImageNet classification with deep convolutional neural networks
TLDR
A large, deep convolutional neural network was trained to classify the 1.2 million high-resolution images in the ImageNet LSVRC-2010 contest into the 1000 different classes and employed a recently developed regularization method called "dropout" that proved to be very effective.
DaDianNao: A Machine-Learning Supercomputer
  • Yunji Chen, Tao Luo, +8 authors O. Temam
  • Computer Science
  • 2014 47th Annual IEEE/ACM International Symposium on Microarchitecture
  • 2014
TLDR
This article introduces a custom multi-chip machine-learning architecture, showing that, on a subset of the largest known neural network layers, it is possible to achieve a speedup of 450.65x over a GPU, and reduce the energy by 150.31x on average for a 64-chip system.
DianNao: a small-footprint high-throughput accelerator for ubiquitous machine-learning
TLDR
This study designs an accelerator for large-scale CNNs and DNNs, with a special emphasis on the impact of memory on accelerator design, performance and energy, and shows that it is possible to design an accelerator with a high throughput, capable of performing 452 GOP/s in a small footprint.
Fixed-point feedforward deep neural network design using weights +1, 0, and −1
TLDR
The designed fixed-point networks with ternary weights (+1, 0, and -1) and 3-bit signals show only negligible performance loss when compared to the floating-point counterparts.
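A minimal version of threshold-based ternarization consistent with this summary: weights whose magnitude falls below a threshold become 0, the rest keep only their sign. The threshold value here is arbitrary; the cited work pairs ternary weights with 3-bit signals and its own threshold selection.

import numpy as np

def ternarize(w, threshold):
    # Map real-valued weights to {-1, 0, +1}: zero out small-magnitude weights,
    # keep only the sign of the rest.
    return np.sign(w) * (np.abs(w) >= threshold)

w = np.array([0.8, -0.05, 0.3, -0.6, 0.02])
print(ternarize(w, threshold=0.25))   # zeros where |w| < 0.25, +/-1 elsewhere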
Deep Learning with Limited Numerical Precision
TLDR
The results show that deep networks can be trained using only a 16-bit wide fixed-point number representation when using stochastic rounding, and incur little to no degradation in classification accuracy.
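A short sketch of stochastic rounding, the ingredient this summary highlights: rounding up with probability equal to the fractional remainder is unbiased in expectation, which is what lets small gradient contributions survive a 16-bit fixed-point representation. Names and bit-widths are illustrative.

import numpy as np

def stochastic_round(x, frac_bits, rng=np.random.default_rng(0)):
    # Round onto a grid with spacing 2**-frac_bits, rounding up with probability
    # equal to the fractional remainder, so E[round(x)] = x (unlike round-to-nearest).
    scale = 2.0 ** frac_bits
    scaled = x * scale
    floor = np.floor(scaled)
    frac = scaled - floor
    return (floor + (rng.random(x.shape) < frac)) / scale

x = np.full(100000, 0.3)
print(stochastic_round(x, 1).mean())   # close to 0.3 even though the grid is {0.0, 0.5, 1.0, ...}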
Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift
TLDR
Applied to a state-of-the-art image classification model, Batch Normalization achieves the same accuracy with 14 times fewer training steps, and beats the original model by a significant margin.
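For reference, the batch-normalization forward pass in its standard training-mode form (per-feature mean and variance over the mini-batch, then a learned scale and shift); running statistics for inference are omitted from this sketch.

import numpy as np

def batch_norm(x, gamma, beta, eps=1e-5):
    # Normalize each feature over the mini-batch to zero mean and unit variance,
    # then apply the learned scale (gamma) and shift (beta).
    mu = x.mean(axis=0)
    var = x.var(axis=0)
    x_hat = (x - mu) / np.sqrt(var + eps)
    return gamma * x_hat + beta

x = np.random.default_rng(0).standard_normal((32, 4)) * 5 + 3   # batch of 32, 4 features
y = batch_norm(x, gamma=np.ones(4), beta=np.zeros(4))
print(y.mean(axis=0).round(6), y.std(axis=0).round(3))          # ~0 means, ~1 stds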
X1000 real-time phoneme recognition VLSI using feed-forward deep neural networks
TLDR
This work develops a digital VLSI for phoneme recognition using deep neural networks and assesses the design in terms of throughput, chip size, and power consumption.
A highly scalable Restricted Boltzmann Machine FPGA implementation
TLDR
This paper describes a novel architecture and FPGA implementation that accelerates the training of general RBMs in a scalable manner, with the goal of producing a system that machine learning researchers can use to investigate ever-larger networks.