Term Revealing: Furthering Quantization at Run Time on Quantized DNNs
@article{Kung2020TermRF,
  title   = {Term Revealing: Furthering Quantization at Run Time on Quantized DNNs},
  author  = {H. T. Kung and Bradley McDanel and S. Zhang},
  journal = {ArXiv},
  year    = {2020},
  volume  = {abs/2007.06389}
}
We present a novel technique, called Term Revealing (TR), for furthering quantization at run time for improved performance of Deep Neural Networks (DNNs) already quantized with conventional quantization methods. TR operates on the power-of-two terms in the binary expansions of values. When computing a dot product, TR dynamically selects a fixed number of the largest terms to use from the values of the two vectors. By exploiting normal-like weight and data distributions…
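The abstract sketches the core operation: expand each quantized value into its power-of-two (binary) terms and keep only the largest few before multiplying. Below is a minimal Python sketch of that idea. The names reveal_terms and tr_dot and the per-value term budget k are illustrative assumptions; the paper applies the term budget across a group of values rather than to each value independently, as done here for simplicity.

def reveal_terms(x, k):
    # Keep only the k largest power-of-two terms of a non-negative
    # integer x, i.e., its k most significant set bits.
    result, bit = 0, x.bit_length() - 1
    while bit >= 0 and k > 0:
        if x & (1 << bit):
            result |= 1 << bit
            k -= 1
        bit -= 1
    return result

def tr_dot(a, b, k):
    # Dot product computed on term-revealed (truncated) operands.
    return sum(reveal_terms(xa, k) * reveal_terms(xb, k) for xa, xb in zip(a, b))

# Example: 8-bit quantized magnitudes with a budget of k = 2 terms each.
a = [183, 12, 97, 5]    # 183 = 10110111b -> keep 128 + 32 = 160
b = [44, 201, 3, 77]
print(tr_dot(a, b, k=2))                  # approximate dot product
print(sum(x * y for x, y in zip(a, b)))   # exact dot product, for comparison

Because quantized DNN weights and activations tend to cluster near zero, most values have few significant terms, so truncating to the top few terms changes the dot product only slightly while bounding the work per multiply.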