Corpus ID: 649645

Fixed Point Quantization of Deep Convolutional Networks

@inproceedings{Lin2016FixedPQ,
  title={Fixed Point Quantization of Deep Convolutional Networks},
  author={Darryl Dexu Lin and Sachin S. Talathi and V. Sreekanth Annapureddy},
  booktitle={ICML},
  year={2016}
}
In recent years increasingly complex architectures for deep convolutional networks (DCNs) have been proposed to boost the performance on image recognition tasks. [...] Key result: in doing so, we report a new state-of-the-art fixed-point performance of 6.78% error rate on the CIFAR-10 benchmark.
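The core operation throughout this line of work is converting floating-point weights and activations to a fixed-point format characterized by a word length and a fractional length. The sketch below (plain NumPy) shows a generic uniform fixed-point quantizer together with a simple range-based choice of fractional length; the function names and the max-based heuristic are illustrative assumptions, not the paper's SQNR-driven bit-width allocation.

import numpy as np

def quantize_to_fixed_point(x, word_length=8, frac_length=6):
    # Round x onto a signed fixed-point grid with step 2**-frac_length and clip to the word length.
    scale = 2.0 ** frac_length
    qmin = -(2 ** (word_length - 1))
    qmax = 2 ** (word_length - 1) - 1
    codes = np.clip(np.round(x * scale), qmin, qmax)
    return codes / scale

def choose_frac_length(x, word_length=8):
    # Range-based heuristic: spend just enough integer bits to cover max |x|, give the rest to the fraction.
    max_abs = float(np.max(np.abs(x))) + 1e-12
    int_bits = int(np.ceil(np.log2(max_abs)))
    return word_length - 1 - int_bits

weights = np.random.randn(64, 3, 3, 3).astype(np.float32) * 0.1
fl = choose_frac_length(weights, word_length=8)
q = quantize_to_fixed_point(weights, word_length=8, frac_length=fl)
print("fractional length:", fl, "max abs error:", float(np.abs(weights - q).max()))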
Space Efficient Quantization for Deep Convolutional Neural Networks
TLDR
This article proposes a space-efficient quantization scheme which uses eight or fewer bits to represent the original 32-bit weights, and adopts the singular value decomposition (SVD) method to decrease the parameter size of fully-connected layers for further compression.
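As a rough illustration of the SVD step mentioned above, the sketch below factors a fully-connected weight matrix into two low-rank factors; the layer size and rank are arbitrary assumptions, and real FC weights compress far better than the random matrix used here.

import numpy as np

def svd_compress_fc(W, rank):
    # Approximate an (out, in) weight matrix by A @ B with rank * (out + in) parameters in total.
    U, S, Vt = np.linalg.svd(W, full_matrices=False)
    A = U[:, :rank] * S[:rank]    # (out, rank)
    B = Vt[:rank, :]              # (rank, in)
    return A, B

W = np.random.randn(1024, 1024).astype(np.float32)
A, B = svd_compress_fc(W, rank=64)
ratio = W.size / (A.size + B.size)
rel_err = np.linalg.norm(W - A @ B) / np.linalg.norm(W)
print("parameter reduction: %.1fx, relative reconstruction error: %.3f" % (ratio, rel_err))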
Towards Lower Bit Multiplication for Convolutional Neural Network Training
TLDR
This paper proposes a fixed-point training framework to reduce the data bit-width for the convolution multiplications, along with two constrained group-wise scaling methods that can be implemented at low hardware cost.
IFQ-Net: Integrated Fixed-Point Quantization Networks for Embedded Vision
TLDR
A fixed-point network for embedded vision tasks is proposed by converting the floating-point data in a quantization network into fixed point, using an integrated conversion of the convolution, batch normalization and quantization layers to overcome the data loss caused by the conversion.
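The "integrated conversion of convolution, batch normalization and quantization layers" essentially folds the BN scale and shift into the preceding convolution before quantizing, so that no floating-point BN remains at inference time. A generic folding sketch follows; the (out_ch, in_ch, kh, kw) weight layout and the variable names are assumptions, not IFQ-Net's exact procedure.

import numpy as np

def fold_batchnorm(conv_w, conv_b, gamma, beta, mean, var, eps=1e-5):
    # Fold BN(z) = gamma * (z - mean) / sqrt(var + eps) + beta into the conv that produces z.
    scale = gamma / np.sqrt(var + eps)                  # one multiplier per output channel
    folded_w = conv_w * scale[:, None, None, None]      # conv_w shape: (out_ch, in_ch, kh, kw)
    folded_b = (conv_b - mean) * scale + beta
    return folded_w, folded_b

out_ch, in_ch = 16, 8
w = np.random.randn(out_ch, in_ch, 3, 3).astype(np.float32)
b = np.zeros(out_ch, dtype=np.float32)
gamma, beta = np.ones(out_ch), np.zeros(out_ch)
mean, var = 0.1 * np.random.randn(out_ch), 0.5 + np.abs(np.random.randn(out_ch))
fw, fb = fold_batchnorm(w, b, gamma, beta, mean, var)
print(fw.shape, fb.shape)   # the folded weights and bias can now be quantized to fixed point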
Fixed-point Quantization of Convolutional Neural Networks for Quantized Inference on Embedded Platforms
TLDR
This paper proposes a method to optimally quantize the weights, biases and activations of each layer of a pre-trained CNN while controlling the loss in inference accuracy, enabling quantized inference and yielding a low-precision CNN with accuracy losses of less than 1%.
KCNN: Kernel-wise Quantization to Remarkably Decrease Multiplications in Convolutional Neural Network
TLDR
This paper quantizes the floating-point weights in each kernel separately to multiple bit planes to remarkably decrease multiplications and proposes dual normalization to solve the pathological curvature problem during fine-tuning.
Deep Convolutional Neural Network Inference with Floating-point Weights and Fixed-point Activations
TLDR
It is shown that using floating-point numbers for weights is more efficient than fixed-point representation for the same bit-width and enables compact hardware multiply-and-accumulate (MAC) unit design.
Unsupervised Network Quantization via Fixed-Point Factorization
TLDR
This article proposes an efficient framework, namely, fixed-point factorized network (FFN), to turn all weights into ternary values, i.e., {−1, 0, 1}, and highlights that the proposed FFN framework can achieve negligible degradation even without any supervised retraining on the labeled data.
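To make the ternary mapping concrete, the sketch below sends weights to alpha * {-1, 0, +1} using a symmetric threshold; the 0.7 x mean|w| threshold rule is the common ternary-weight-network convention and is only a stand-in for FFN's fixed-point factorization.

import numpy as np

def ternarize(w, threshold_factor=0.7):
    # Map weights to alpha * t with t in {-1, 0, +1}; weights inside the threshold band become zero.
    delta = threshold_factor * np.mean(np.abs(w))
    t = np.zeros_like(w)
    t[w > delta] = 1.0
    t[w < -delta] = -1.0
    nonzero = t != 0
    alpha = float(np.mean(np.abs(w[nonzero]))) if nonzero.any() else 0.0   # per-layer scale
    return alpha, t

w = np.random.randn(256, 256).astype(np.float32) * 0.05
alpha, t = ternarize(w)
print("scale: %.4f, nonzero fraction: %.2f" % (alpha, float(np.mean(t != 0))))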
Phoenix: A Low-Precision Floating-Point Quantization Oriented Architecture for Convolutional Neural Networks
TLDR
A normalization-oriented 8-bit floating-point quantization processor, named Phoenix, is proposed to reduce storage and memory access with negligible accuracy loss, and a hardware processor is designed to address the hardware inefficiency caused by the floating-point multiplier.
Rethinking floating point for deep learning
TLDR
This work improves floating point to be more energy-efficient than equivalent-bit-width integer hardware on a 28 nm ASIC process while retaining accuracy at 8 bits, using a novel hybrid log multiply / linear add, Kulisch accumulation and tapered encodings from Gustafson's posit format.
F8Net: Fixed-Point 8-bit Only Multiplication for Network Quantization
TLDR
This work presents F8Net, a novel quantization framework consisting of only fixed-point 8-bit multiplication, which achieves comparable or better performance than existing quantization techniques using INT32 multiplication or floating-point arithmetic, and even than the full-precision counterparts, reaching state-of-the-art results.
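To illustrate what "fixed-point 8-bit only multiplication" means in practice, the sketch below multiplies two int8 tensors whose real values are code x 2**-frac, accumulates in int32, and rescales back to int8 with a rounding right shift. The fractional lengths are arbitrary assumptions and this is not F8Net's specific training or fractional-length selection scheme.

import numpy as np

def fixed_point_matmul(a_codes, b_codes, fa, fb, out_frac):
    # int8 x int8 matmul with int32 accumulation; real values are codes * 2**-frac.
    acc = a_codes.astype(np.int32) @ b_codes.astype(np.int32)   # product carries fractional length fa + fb
    shift = fa + fb - out_frac                                   # bits to drop to reach the output format
    rounded = (acc + (1 << (shift - 1))) >> shift                # round to nearest via add-then-shift (assumes shift > 0)
    return np.clip(rounded, -128, 127).astype(np.int8)

a = np.random.randint(-128, 128, size=(4, 8), dtype=np.int8)
b = np.random.randint(-128, 128, size=(8, 4), dtype=np.int8)
print(fixed_point_matmul(a, b, fa=6, fb=6, out_frac=5))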
...
...

References

Showing 1-10 of 34 references
Fixed point optimization of deep convolutional neural networks for object recognition
TLDR
The results indicate that quantization induces sparsity in the network, which reduces the effective number of network parameters and improves generalization; the quantized networks require roughly one tenth of the memory storage and achieve better classification results than the high-precision networks.
Compressing Deep Convolutional Networks using Vector Quantization
TLDR
This paper achieves 16-24 times compression of the network with only 1% loss of classification accuracy using a state-of-the-art CNN, and finds that, for compressing the most storage-demanding densely connected layers, vector quantization methods have a clear gain over existing matrix factorization methods.
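The vector quantization referred to above can be sketched as k-means over short weight sub-vectors: each sub-vector is replaced by a one-byte codeword index, so storage falls from 32 bits per weight to roughly 8/d bits plus a small codebook. The sub-vector length, codebook size, and use of scikit-learn's KMeans are illustrative assumptions, not the paper's exact configuration.

import numpy as np
from sklearn.cluster import KMeans

def vector_quantize(W, subvector_len=4, n_codewords=256):
    # Cluster weight sub-vectors and store only the codebook plus one index per sub-vector.
    flat = W.reshape(-1, subvector_len)                 # assumes W.size is divisible by subvector_len
    km = KMeans(n_clusters=n_codewords, n_init=4, random_state=0).fit(flat)
    codebook = km.cluster_centers_                      # (n_codewords, subvector_len)
    indices = km.labels_.astype(np.uint8)               # fits in one byte since n_codewords <= 256
    return codebook, indices

def reconstruct(codebook, indices, shape):
    return codebook[indices].reshape(shape)

W = np.random.randn(256, 128).astype(np.float32)
codebook, idx = vector_quantize(W)
W_hat = reconstruct(codebook, idx, W.shape)
orig_bits = W.size * 32
vq_bits = idx.size * 8 + codebook.size * 32
print("compression: %.1fx, relative error: %.3f"
      % (orig_bits / vq_bits, np.linalg.norm(W - W_hat) / np.linalg.norm(W)))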
A Deep Neural Network Compression Pipeline: Pruning, Quantization, Huffman Encoding
TLDR
A three-stage pipeline of pruning, quantization and Huffman encoding is introduced; the stages work together to reduce the storage requirement of neural networks by 35× to 49× without affecting their accuracy.
Overcoming Challenges in Fixed Point Training of Deep Convolutional Networks
TLDR
This work attempts to draw a theoretical connection between low numerical precision and training algorithm stability, and proposes and experimentally verifies methods that improve the training performance of deep convolutional networks in fixed point.
Very Deep Convolutional Networks for Large-Scale Image Recognition
TLDR
This work investigates the effect of convolutional network depth on accuracy in the large-scale image recognition setting using an architecture with very small convolution filters, showing that a significant improvement over the prior-art configurations can be achieved by pushing the depth to 16-19 weight layers.
Deep Compression: Compressing Deep Neural Network with Pruning, Trained Quantization and Huffman Coding
TLDR
This work introduces "deep compression", a three-stage pipeline of pruning, trained quantization and Huffman coding, whose stages work together to reduce the storage requirement of neural networks by 35x to 49x without affecting their accuracy.
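A compact sketch of the three stages described above: magnitude pruning, weight sharing over the surviving weights, and an entropy estimate of the Huffman-coded index stream. The uniform shared-value codebook (the paper uses k-means), the sparsity level, the bit-width, and the omission of sparse index storage are all simplifying assumptions.

import numpy as np

def prune_by_magnitude(W, sparsity=0.9):
    # Zero out the smallest-magnitude weights so that roughly `sparsity` of them become exactly zero.
    thresh = np.quantile(np.abs(W), sparsity)
    return np.where(np.abs(W) >= thresh, W, 0.0)

def share_weights(nonzero_vals, bits=5):
    # Snap surviving weights onto 2**bits shared values (uniform codebook here; k-means in the paper).
    codebook = np.linspace(nonzero_vals.min(), nonzero_vals.max(), 2 ** bits)
    idx = np.abs(nonzero_vals[:, None] - codebook).argmin(axis=1)
    return codebook, idx

def huffman_bits_estimate(indices):
    # Entropy of the index histogram: a close lower bound on the Huffman-coded size in bits.
    _, counts = np.unique(indices, return_counts=True)
    p = counts / counts.sum()
    return int(np.ceil(-(p * np.log2(p)).sum() * indices.size))

W = 0.05 * np.random.randn(512, 512).astype(np.float32)
pruned = prune_by_magnitude(W, sparsity=0.9)
nz = pruned[pruned != 0]
codebook, idx = share_weights(nz, bits=5)
print("surviving weights: %d of %d" % (nz.size, W.size))
print("Huffman estimate for shared-value indices: %d bits (dense fp32 would be %d bits)"
      % (huffman_bits_estimate(idx), W.size * 32))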
Deep Learning with Limited Numerical Precision
TLDR
The results show that deep networks can be trained using only 16-bit wide fixed-point number representation when using stochastic rounding, and incur little to no degradation in the classification accuracy.
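The stochastic rounding referred to above rounds a value to the next grid point up or down with probability proportional to its distance to each, so the rounding error is zero in expectation. A minimal NumPy sketch with an assumed 16-bit word and fractional length follows.

import numpy as np

def stochastic_round_fixed(x, word_length=16, frac_length=12, rng=None):
    # Round x * 2**frac_length to an integer stochastically, then clip to the signed word length.
    rng = np.random.default_rng(0) if rng is None else rng
    scaled = x * (2.0 ** frac_length)
    floor = np.floor(scaled)
    prob_up = scaled - floor                         # P(round up) equals the fractional remainder
    rounded = floor + (rng.random(x.shape) < prob_up)
    qmin, qmax = -(2 ** (word_length - 1)), 2 ** (word_length - 1) - 1
    return np.clip(rounded, qmin, qmax) / (2.0 ** frac_length)

x = np.full(100000, 0.30004)                         # sits strictly between two grid points
print("grid step:", 2.0 ** -12)
print("mean of stochastic roundings:", stochastic_round_fixed(x).mean())   # close to 0.30004 in expectation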
Improving the speed of neural networks on CPUs
TLDR
This paper uses speech recognition as an example task, and shows that a real-time hybrid hidden Markov model / neural network (HMM/NN) large-vocabulary system can be built with a 10× speedup over an unoptimized baseline and a 4× speedup over an aggressively optimized floating-point baseline, at no cost in accuracy.
Low precision arithmetic for deep learning
TLDR
It is found that very low precision computation is sufficient not just for running trained networks but also for training them.
Training deep neural networks with low precision multiplications
TLDR
It is found that very low precision is sufficient not just for running trained networks but also for training them, and that it is possible to train Maxout networks with 10-bit multiplications.
...
...