Distribution Adaptive INT8 Quantization for Training CNNs
@article{Zhao2021DistributionAI, title={Distribution Adaptive INT8 Quantization for Training CNNs}, author={Kang Zhao and Sida Huang and Pan Pan and Yinghan Li and Yingya Zhang and Zhenyu Gu and Yinghui Xu}, journal={ArXiv}, year={2021}, volume={abs/2102.04782} }
Research has demonstrated that low bit-width (e.g., INT8) quantization can be employed to accelerate inference. This makes gradient quantization very promising, since backward propagation requires approximately twice as much computation as the forward pass. Due to the variability and uncertainty of gradient distributions, many methods have been proposed to attain training stability. However, most of them ignore the channel-wise gradient distributions and the impact of gradients…
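As a point of reference for the abstract, channel-wise INT8 gradient quantization can be sketched in a few lines of NumPy; the symmetric max-scaled scheme and the function name below are illustrative assumptions, not the paper's distribution-adaptive method.

import numpy as np

def per_channel_int8_quantize(grad, channel_axis=0, eps=1e-12):
    """Symmetric per-channel INT8 quantization of a gradient tensor.

    Illustrative sketch only: each channel's scale is taken from its max
    absolute value, one common choice rather than the paper's scheme.
    """
    # Move the channel axis to the front and flatten the rest.
    g = np.moveaxis(grad, channel_axis, 0)
    flat = g.reshape(g.shape[0], -1)

    # One scale per channel: map the channel's max |g| to 127.
    scale = np.abs(flat).max(axis=1, keepdims=True) / 127.0 + eps
    q = np.clip(np.round(flat / scale), -127, 127).astype(np.int8)

    # Dequantize to inspect the rounding error introduced per channel.
    dequant = (q.astype(np.float32) * scale).reshape(g.shape)
    return q.reshape(g.shape), scale.squeeze(1), np.moveaxis(dequant, 0, channel_axis)

# Example: gradients of a conv layer with 16 output channels.
grad = np.random.randn(16, 3, 3, 3).astype(np.float32) * 1e-3
q, scales, approx = per_channel_int8_quantize(grad, channel_axis=0)
print(np.abs(grad - approx).max())

Per-channel scales let channels with small gradient magnitudes keep resolution instead of being flattened by a single per-tensor scale, which is why channel-wise gradient statistics matter.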
15 Citations
Rethinking the Importance of Quantization Bias, Toward Full Low-Bit Training
- Computer Science, IEEE Transactions on Image Processing
- 2022
This is the first work to quantize the gradients of all layers to 8 bits in both large-scale CNN and RNN training with negligible accuracy loss; it proposes a novel adaptive piecewise quantization method that effectively limits the bias of gradient quantization noise.
F8Net: Fixed-Point 8-bit Only Multiplication for Network Quantization
- Computer Science
- 2022
This work presents F8Net, a novel quantization framework consisting of only fixed-point 8-bit multiplication, which achieves comparable or better performance not only relative to existing quantization techniques that use INT32 multiplication or floating-point arithmetic, but also relative to full-precision counterparts, reaching state-of-the-art results.
Is Integer Arithmetic Enough for Deep Learning Training?
- Computer Science, ArXiv
- 2022
The novel training method forms a fully integer training pipeline that does not change the trajectory of the loss and accuracy compared to floating-point, nor does it need any special hyper-parameter tuning, distribution adjustment, or gradient clipping.
You Already Have It: A Generator-Free Low-Precision DNN Training Framework Using Stochastic Rounding
- Computer Science, ECCV
- 2022
This paper innovatively proposes to exploit the stochastic nature of the DNN training process itself and directly extract random numbers from DNNs in a self-sufficient manner; evaluating the quality of the extracted random numbers shows that high-quality random numbers widely exist in DNNs and can even pass the NIST test suite.
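For context, the stochastic rounding operation that this generator-free framework supplies randomness for can be sketched as follows; NumPy's default generator stands in for the DNN-extracted random numbers, purely for illustration.

import numpy as np

def stochastic_round(x, rng=None):
    """Round each value up or down with probability proportional to its
    fractional part, so the rounding is unbiased in expectation.

    `rng` stands in for the random-number source; the cited paper extracts
    this randomness from the DNN itself, which is not reproduced here.
    """
    rng = rng or np.random.default_rng()
    floor = np.floor(x)
    frac = x - floor
    return floor + (rng.random(x.shape) < frac)

x = np.array([0.3, 1.7, -0.25])
print(stochastic_round(x))  # e.g. [0. 2. -1.] or [1. 2. 0.]
# The expectation matches the input: the mean below is close to 0.3.
print(np.mean([stochastic_round(np.array([0.3])) for _ in range(10000)]))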
On the Convergence of Stochastic Gradient Descent in Low-precision Number Formats
- Computer Science
- 2023
Both deterministic and stochastic analyses of the SGD algorithm are presented, yielding bounds that show the effect of the number format and provide guidelines on how SGD convergence is affected when high-precision computation is not feasible.
Exploiting the Partly Scratch-off Lottery Ticket for Quantization-Aware Training
- Computer Science, ArXiv
- 2022
A heuristic method, dubbed lottery ticket scratcher (LTS), is developed: it freezes a weight once the distance between its full-precision value and its quantization level falls below a controllable threshold, which typically eliminates 30%-60% of weight updates and 15%-30% of the FLOPs of the backward pass.
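The freezing criterion summarized above amounts to a simple distance test; the following is a hypothetical NumPy rendering of it, with quant_levels and threshold as placeholder parameters rather than the paper's actual settings.

import numpy as np

def lts_freeze_mask(weights, quant_levels, threshold):
    """Return a boolean mask of weights to freeze: a weight is frozen once
    its distance to the nearest quantization level drops below `threshold`.
    """
    # Distance from each weight to its nearest quantization level.
    dist = np.abs(weights[..., None] - quant_levels).min(axis=-1)
    return dist < threshold

w = np.random.randn(4, 4) * 0.1
levels = np.linspace(-0.3, 0.3, 16)   # e.g. a uniform 4-bit grid (assumed)
frozen = lts_freeze_mask(w, levels, threshold=0.005)
print(frozen.mean())                  # fraction of weights whose update would be skipped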
FAT: An In-Memory Accelerator with Fast Addition for Ternary Weight Neural Networks
- Computer Science, IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems
- 2022
A Sparse Addition Control Unit is proposed that utilizes the sparsity of TWNs to skip null operations on zero weights, along with a fast addition scheme based on the memory sense amplifier that avoids the time overhead of both carry propagation and writing the carry back to memory cells.
Towards Accurate Binary Neural Networks via Modeling Contextual Dependencies
- Computer Science, ECCV
- 2022
This work proposes a binary multi-layer perceptron (MLP) block as an alternative to binary convolution blocks in order to directly model contextual dependencies, and builds BNNs with explicit contextual dependency modeling, termed BCDNet.
TAB: Unified and Optimized Ternary, Binary, and Mixed-precision Neural Network Inference on the Edge
- Computer Science, ACM Trans. Embed. Comput. Syst.
- 2022
TAB includes a unified value representation, an efficient data storage scheme, and novel bitwise dot-product pipelines on CPU/GPU platforms, and introduces a bitwidth-last data format that stores the first and second bits of the ternary values separately to remove the bit-extraction overhead.
Bitwidth Heterogeneous Federated Learning with Progressive Weight Dequantization
- Computer Science, ICML
- 2022
This work introduces a pragmatic FL scenario with bitwidth heterogeneity across the participating devices, dubbed Bitwidth Heterogeneous Federated Learning (BHFL), and proposes the ProWD framework, which uses a trainable weight dequantizer at the central server to progressively reconstruct low-bitwidth weights into higher-bitwidth and, finally, full-precision weights.
References
Showing 1-10 of 43 references
Towards Unified INT8 Training for Convolutional Neural Network
- Computer Science, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
- 2020
This work attempts to build a unified 8-bit (INT8) training framework for common convolutional neural networks from the perspectives of both accuracy and speed, and proposes two universal techniques that reduce the direction deviation of gradients and avoid illegal gradient updates along the wrong direction.
Accurate and Efficient 2-bit Quantized Neural Networks
- Computer Science, MLSys
- 2019
Novel techniques that individually target weight and activation quantization are proposed, resulting in an overall quantized neural network (QNN) that achieves state-of-the-art classification accuracy (comparable to full-precision networks) across a range of popular models and datasets.
Fixed-Point Back-Propagation Training
- Computer Science, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
- 2020
By keeping the data distribution stable through a layer-wise precision-adaptive quantization, this paper is able to directly train deep neural networks using low bit-width fixed-point data and achieve guaranteed accuracy, without changing hyperparameters.
LQ-Nets: Learned Quantization for Highly Accurate and Compact Deep Neural Networks
- Computer Science, ECCV
- 2018
This work proposes to jointly train a quantized, bit-operation-compatible DNN and its associated quantizers, as opposed to using fixed, handcrafted quantization schemes such as uniform or logarithmic quantization, to address the gap in prediction accuracy between the quantized model and the full-precision model.
Data-Free Quantization Through Weight Equalization and Bias Correction
- Computer Science, 2019 IEEE/CVF International Conference on Computer Vision (ICCV)
- 2019
We introduce a data-free quantization method for deep neural networks that does not require fine-tuning or hyperparameter selection. It achieves near-original model performance on common computer…
Trained Quantization Thresholds for Accurate and Efficient Fixed-Point Inference of Deep Neural Networks
- Computer Science, MLSys
- 2020
The proposed method of training quantization thresholds (TQT) for uniform symmetric quantizers using standard backpropagation and gradient descent is able to achieve near-floating-point accuracy on traditionally difficult networks such as MobileNets with less than 5 epochs of quantized (8-bit) retraining.
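A minimal sketch of a uniform symmetric quantizer with a trainable clipping threshold, in the spirit of TQT, is shown below; the parameterization via log2_t follows the cited idea only loosely, and the straight-through gradient needed to actually train the threshold is omitted.

import numpy as np

def tqt_quantize(x, log2_t, bits=8):
    """Uniform symmetric fake-quantization with a trainable threshold.

    Illustrative sketch, not the authors' code: the clipping threshold is
    parameterized as 2**log2_t and would be updated by backpropagation
    with a straight-through estimator (not shown).
    """
    t = 2.0 ** log2_t            # trainable clipping threshold
    n = 2 ** (bits - 1) - 1      # 127 for 8 bits
    scale = t / n
    q = np.clip(np.round(x / scale), -n, n)
    return q * scale             # dequantized ("fake-quantized") output

x = np.random.randn(1000).astype(np.float32)
print(np.abs(x - tqt_quantize(x, log2_t=np.log2(3.0))).mean())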
Deep Learning with Low Precision by Half-Wave Gaussian Quantization
- Computer Science, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
- 2017
A half-wave Gaussian quantizer (HWGQ) is proposed for forward approximation and shown to have an efficient implementation that exploits the statistics of network activations and batch normalization operations, achieving performance much closer to full-precision networks than previously available low-precision networks.
Simultaneously Optimizing Weight and Quantizer of Ternary Neural Network Using Truncated Gaussian Approximation
- Computer Science, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
- 2019
This work is the first to incorporate the thresholds of weight ternarization into a closed-form representation using truncated Gaussian approximation, enabling simultaneous optimization of weights and quantizer through back-propagation training.
Towards Effective Low-Bitwidth Convolutional Neural Networks
- Computer Science, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition
- 2018
This paper tackles the problem of training a deep convolutional neural network with both low-precision weights and low-bitwidth activations by proposing a two-stage optimization strategy to progressively find good local minima and adopting a novel learning scheme that jointly trains a full-precision model alongside the low-precision one.
Two-Step Quantization for Low-bit Neural Networks
- Computer Science, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition
- 2018
A simple yet effective Two-Step Quantization (TSQ) framework is proposed that decomposes the network quantization problem into two steps, code learning and transformation-function learning based on the learned codes, together with a sparse quantization method for code learning.