Corpus ID: 53389725

Accumulation Bit-Width Scaling For Ultra-Low Precision Training Of Deep Networks

@article{Sakr2019AccumulationBS,
  title={Accumulation Bit-Width Scaling For Ultra-Low Precision Training Of Deep Networks},
  author={Charbel Sakr and Naigang Wang and Chia-Yu Chen and Jungwook Choi and Ankur Agrawal and Naresh R. Shanbhag and Kailash Gopalakrishnan},
  journal={ArXiv},
  year={2019},
  volume={abs/1901.06588}
}
Efforts to reduce the numerical precision of computations in deep learning training have yielded systems that aggressively quantize weights and activations, yet employ wide, high-precision accumulators for partial sums in inner-product operations to preserve the quality of convergence. The absence of any framework to analyze the precision requirements of partial-sum accumulation results in conservative design choices. This imposes an upper bound on the reduction of complexity of multiply… 
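The accumulation issue described above can be seen in a small numerical experiment. Below is a minimal NumPy sketch (an illustration under simple assumptions, not the paper's analysis) that rounds the running partial sum of a long dot product to a limited number of significand bits after every addition, so small products get swamped by a large partial sum; the gap from the FP64 reference grows as the accumulator narrows.

```python
import numpy as np

def round_to_mantissa(x, m_bits):
    """Round a float to m_bits of significand precision.

    frexp/ldexp isolate the significand, mimicking a reduced-precision
    floating-point accumulator register (exponent range is ignored).
    """
    mant, exp = np.frexp(x)                    # x = mant * 2**exp, 0.5 <= |mant| < 1
    mant = np.round(mant * 2.0 ** m_bits) / 2.0 ** m_bits
    return np.ldexp(mant, exp)

def dot_with_narrow_accumulator(a, b, m_bits):
    """Sequential dot product whose running sum is rounded to m_bits after
    every addition, so small products can be swamped by a large partial sum."""
    acc = 0.0
    for ai, bi in zip(a, b):
        acc = round_to_mantissa(acc + ai * bi, m_bits)
    return float(acc)

rng = np.random.default_rng(0)
n = 4096                                       # inner-product length (e.g. layer fan-in)
a, b = rng.standard_normal(n), rng.standard_normal(n)

exact = float(np.dot(a, b))                    # wide (FP64) accumulation as reference
for m in (23, 16, 12, 8):                      # candidate accumulator significand widths
    approx = dot_with_narrow_accumulator(a, b, m)
    print(f"significand bits = {m:2d}, relative error = {abs(approx - exact) / abs(exact):.1e}")
```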

Citations

Ultra-Low Precision 4-bit Training of Deep Neural Networks
TLDR
A novel adaptive gradient scaling technique (GradScale) is explored that addresses the challenges of insufficient range and resolution in quantized gradients, and the impact of quantization errors observed during model training is studied.
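As a rough illustration of per-tensor gradient scaling before low-precision quantization, the sketch below scales a gradient so its largest magnitude fills a small signed fixed-point grid and returns the scale so the update can undo it. This is a generic sketch under assumed parameters, not the GradScale algorithm from the cited paper.

```python
import numpy as np

def scale_and_quantize_grad(grad, num_bits=4):
    """Scale a gradient tensor so its largest magnitude fills a signed
    fixed-point grid, quantize, and return the scale so the optimizer can
    undo it.  (Generic per-tensor scaling for illustration only.)"""
    qmax = 2 ** (num_bits - 1) - 1             # e.g. 7 for 4-bit signed, symmetric
    max_abs = float(np.max(np.abs(grad)))
    scale = max_abs / qmax if max_abs > 0 else 1.0
    q = np.clip(np.round(grad / scale), -qmax, qmax).astype(np.int8)
    return q, scale

g = np.random.default_rng(1).standard_normal((64, 64)) * 1e-4   # tiny gradients
q, s = scale_and_quantize_grad(g)
g_hat = q.astype(np.float32) * s                # dequantized gradient for the update
print("max quantization error:", float(np.max(np.abs(g_hat - g))))
```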
WrapNet: Neural Net Inference with Ultra-Low-Precision Arithmetic
TLDR
WrapNet is proposed, an architecture that adapts neural networks to use low-precision (8-bit) additions in the accumulators while achieving classification accuracy comparable to their 32-bit counterparts; it achieves resilience to low-precision accumulation by inserting a cyclic activation layer that makes results invariant to overflow.
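The overflow-invariance idea can be demonstrated with modular arithmetic: wrapping the accumulator after every addition leaves the result unchanged modulo 2^B, so any function that is periodic with period 2^B ignores intermediate overflow. The toy sketch below assumes an 8-bit two's-complement accumulator and a sinusoidal stand-in for the cyclic activation; it is not WrapNet's actual layer.

```python
import numpy as np

BITS = 8                                 # assumed accumulator width (illustration only)
MOD = 1 << BITS

def wrap(v):
    """Two's-complement wraparound into [-2**(BITS-1), 2**(BITS-1))."""
    return ((v + MOD // 2) % MOD) - MOD // 2

def accumulate_with_overflow(products):
    """Accumulate integer products in a BITS-wide register that silently wraps."""
    acc = 0
    for p in products:
        acc = wrap(acc + p)
    return acc

rng = np.random.default_rng(2)
w = rng.integers(-8, 8, size=512)
x = rng.integers(-8, 8, size=512)
products = (w * x).tolist()

exact = int(np.sum(products))                   # wide accumulation
wrapped = accumulate_with_overflow(products)    # narrow, overflowing accumulation

# A periodic ("cyclic") function of the accumulator only sees its value modulo
# 2**BITS, so overflow during accumulation does not change the output.
cyclic = lambda v: np.sin(2.0 * np.pi * wrap(v) / MOD)
print(wrap(exact) == wrapped)                   # True: same residue either way
print(cyclic(exact), cyclic(wrapped))           # identical activations
```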
Schrödinger's FP: Dynamic Adaptation of Floating-Point Containers for Deep Learning Training
We introduce a software-hardware co-design approach to reduce memory traffic and footprint during training with BFloat16 or FP32, boosting energy efficiency and execution-time performance.
Adaptive Loss Scaling for Mixed Precision Training
TLDR
This work introduces a loss-scaling-based training method called adaptive loss scaling that makes mixed precision training (MPT) easier and more practical to use by removing the need to tune a model-specific loss-scale hyperparameter.
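For context, here is a minimal sketch of the kind of loss-scaling loop such methods build on: a single global, dynamically adjusted scale. The cited paper instead derives scales adaptively per layer, so treat this only as background, with hyperparameter values chosen for illustration.

```python
import numpy as np

class DynamicLossScaler:
    """One global, dynamically adjusted loss scale: multiply the loss by
    `scale` before backprop so small FP16 gradients do not underflow, divide
    the gradients by `scale` before the update, shrink on overflow, and grow
    after a run of clean steps."""

    def __init__(self, init_scale=2.0 ** 15, growth_interval=2000):
        self.scale = init_scale
        self.growth_interval = growth_interval
        self._good_steps = 0

    def unscale_or_skip(self, grads):
        """Return unscaled grads, or None if any gradient overflowed (skip step)."""
        if any(not np.all(np.isfinite(g)) for g in grads):
            self.scale /= 2.0                   # overflow: back off and skip
            self._good_steps = 0
            return None
        self._good_steps += 1
        if self._good_steps % self.growth_interval == 0:
            self.scale *= 2.0                   # long clean run: try a larger scale
        return [g / self.scale for g in grads]

scaler = DynamicLossScaler()
fake_grads = [np.array([1e-3, np.inf], dtype=np.float16)]   # simulated overflow
print(scaler.unscale_or_skip(fake_grads), scaler.scale)     # None, halved scale
```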
FPRaker: A Processing Element For Accelerating Neural Network Training
TLDR
FPRaker, a processing element for composing training accelerators, processes several floating-point multiply-accumulate operations concurrently and accumulates their results into a higher-precision accumulator, which naturally amplifies performance with training methods that use a different precision per layer.
RaPiD: AI Accelerator for Ultra-low Precision Training and Inference
TLDR
This work designed RaPiD, a 4-core AI accelerator chip supporting a spectrum of precisions, namely 16- and 8-bit floating-point and 4- and 2-bit fixed-point, and evaluated DNN inference and DNN training for a 768-TFLOPS AI system comprising 4 32-core RaPiD chips.
Efficient AI System Design With Cross-Layer Approximate Computing
TLDR
RaPiD, a multi-tera-operations-per-second (TOPS) AI hardware accelerator core built from the ground up using approximate-computing (AxC) techniques across the stack, including algorithms, architecture, programmability, and hardware, is presented.
CodeNet: Training Large Scale Neural Networks in Presence of Soft-Errors
TLDR
The experiments show that CodeNet achieves the best accuracy-runtime tradeoff compared to both replication and uncoded strategies, and is a significant step towards biologically plausible neural network training, which could hold the key to orders-of-magnitude efficiency improvements.
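The general flavor of coded computation for catching soft errors can be shown with a simple checksum-protected matrix multiply (classic algorithm-based fault tolerance). This is far simpler than the coding and decentralization strategy in CodeNet and is only meant as a pointer to the idea.

```python
import numpy as np

def checksum_matmul(A, B, atol=1e-6):
    """Carry a checksum row of A and a checksum column of B through the
    multiplication; a mismatch in the result's checksums flags a soft error."""
    Ac = np.vstack([A, A.sum(axis=0, keepdims=True)])   # append column sums to A
    Bc = np.hstack([B, B.sum(axis=1, keepdims=True)])   # append row sums to B
    C = Ac @ Bc
    body = C[:-1, :-1]
    row_ok = np.allclose(C[-1, :-1], body.sum(axis=0), atol=atol)
    col_ok = np.allclose(C[:-1, -1], body.sum(axis=1), atol=atol)
    return body, (row_ok and col_ok)

rng = np.random.default_rng(3)
A, B = rng.standard_normal((4, 5)), rng.standard_normal((5, 3))
C, ok = checksum_matmul(A, B)
print(ok, np.allclose(C, A @ B))                         # True True
```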
...
...

References

Showing 1-10 of 22 references
Fixed Point Quantization of Deep Convolutional Networks
TLDR
This paper proposes a quantizer design for fixed-point implementation of DCNs, formulates and solves an optimization problem to identify the optimal fixed-point bit-width allocation across DCN layers, and demonstrates that fine-tuning can further enhance the accuracy of fixed-point DCNs beyond that of the original floating-point model.
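A uniform signed fixed-point quantizer, the basic primitive such bit-width allocation works with, can be sketched as follows. This is a minimal illustration with assumed bit-widths; the per-layer allocation itself is the paper's optimization problem and is not reproduced here.

```python
import numpy as np

def fixed_point_quantize(x, total_bits=8, frac_bits=6):
    """Round to a grid of step 2**-frac_bits and saturate to the range
    representable with total_bits two's-complement bits."""
    step = 2.0 ** -frac_bits
    qmin = -(2 ** (total_bits - 1)) * step
    qmax = (2 ** (total_bits - 1) - 1) * step
    return np.clip(np.round(x / step) * step, qmin, qmax)

w = np.random.default_rng(4).standard_normal(10)
print(fixed_point_quantize(w, total_bits=8, frac_bits=6))
```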
Mixed Precision Training
TLDR
This work introduces a technique to train deep neural networks using half-precision floating-point numbers, and demonstrates that this approach works for a wide variety of models including convolutional neural networks, recurrent neural networks, and generative adversarial networks.
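A toy version of the recipe (FP16 arithmetic for the forward/backward pass, an FP32 master copy of the weights, and a loss scale) applied to plain linear regression is sketched below. Function names and hyperparameters are illustrative assumptions, and real training relies on a framework's autograd rather than hand-written gradients.

```python
import numpy as np

def mixed_precision_step(master_w, x, y, lr=1e-2, loss_scale=1024.0):
    """One SGD step on a least-squares loss: FP16 forward/backward, scaled
    FP16 gradient, unscaling and weight update in FP32."""
    w16 = master_w.astype(np.float16)                      # half-precision working copy
    x16, y16 = x.astype(np.float16), y.astype(np.float16)

    err = x16 @ w16 - y16                                  # FP16 forward pass
    grad16 = (2.0 * loss_scale / len(y)) * (x16.T @ err)   # scaled FP16 gradient

    grad32 = grad16.astype(np.float32) / loss_scale        # unscale in FP32
    return master_w - np.float32(lr) * grad32              # update FP32 master weights

rng = np.random.default_rng(5)
x, true_w = rng.standard_normal((256, 8)), rng.standard_normal(8)
y = x @ true_w
w = np.zeros(8, dtype=np.float32)
for _ in range(200):
    w = mixed_precision_step(w, x, y)
print("max weight error:", float(np.max(np.abs(w - true_w))))
```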
Training and Inference with Integers in Deep Neural Networks
TLDR
Empirically, this work demonstrates the potential to deploy training in hardware systems such as integer-based deep learning accelerators and neuromorphic chips, with comparable accuracy and higher energy efficiency, which is crucial for future AI applications in variable scenarios with transfer and continual learning demands.
Deep Compression: Compressing Deep Neural Network with Pruning, Trained Quantization and Huffman Coding
TLDR
This work introduces "deep compression", a three stage pipeline: pruning, trained quantization and Huffman coding, that work together to reduce the storage requirement of neural networks by 35x to 49x without affecting their accuracy.
Exploiting approximate computing for deep learning acceleration
TLDR
Based on earlier studies demonstrating that DNNs are resilient to numerical errors from approximate computing, techniques to reduce the communication overhead of distributed deep learning training via adaptive residual gradient compression (AdaComp), and the computation cost of deep learning inference via PArameterized Clipping acTivation (PACT) based network quantization, are presented.
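The PACT forward pass is simple enough to sketch directly: clip activations to a learned range [0, alpha] and quantize uniformly to k bits. The straight-through handling of alpha is only hinted at here, so treat this as a sketch of the summarized formula rather than a reference implementation.

```python
import numpy as np

def pact_quantize(x, alpha, k=4):
    """Clip activations to [0, alpha] and quantize uniformly to k bits.
    Also returns the straight-through gradient of the output w.r.t. alpha
    (1 where the input was clipped at the top, 0 elsewhere)."""
    levels = 2 ** k - 1
    y = np.clip(x, 0.0, alpha)
    y_q = np.round(y * levels / alpha) * alpha / levels
    dalpha = (x >= alpha).astype(x.dtype)
    return y_q, dalpha

x = np.linspace(-1.0, 3.0, 9)
y_q, dalpha = pact_quantize(x, alpha=2.0, k=4)
print(y_q)
print(dalpha)
```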
Flexpoint: An Adaptive Numerical Format for Efficient Training of Deep Neural Networks
TLDR
The results suggest Flexpoint as a promising numerical format for future hardware for training and inference, and demonstrate that 16-bit Flexpoint closely matches 32-bit floating point in training all three models, without any need for tuning of model hyperparameters.
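A block-floating-point encoding with one exponent shared across a tensor, in the spirit of Flexpoint, can be sketched as follows. The real format also predicts the shared exponent across iterations; this simplified version just derives it from the current tensor, with an assumed mantissa width.

```python
import numpy as np

def to_shared_exponent(x, mantissa_bits=16):
    """Encode a tensor as per-element integer mantissas plus one shared
    exponent, chosen so the largest value fits the signed mantissa range."""
    max_abs = float(np.max(np.abs(x)))
    if max_abs == 0.0:
        return np.zeros(x.shape, dtype=np.int32), 0
    exp = int(np.ceil(np.log2(max_abs))) - (mantissa_bits - 1)
    mant = np.clip(np.round(x / 2.0 ** exp),
                   -(2 ** (mantissa_bits - 1)), 2 ** (mantissa_bits - 1) - 1)
    return mant.astype(np.int32), exp

def from_shared_exponent(mant, exp):
    return mant.astype(np.float64) * 2.0 ** exp

x = np.random.default_rng(7).standard_normal(6)
mant, exp = to_shared_exponent(x)
print(x)
print(from_shared_exponent(mant, exp))          # close to x, one shared exponent
```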
Discovering Low-Precision Networks Close to Full-Precision Networks for Efficient Embedded Inference
TLDR
Sensitivity analysis indicates that simple techniques, coupled with proper calibration of the activation-function range to take full advantage of the limited precision, are sufficient to discover low-precision networks, if they exist, that are close to the fp32-precision baseline networks.
Trained Ternary Quantization
TLDR
This work proposes Trained Ternary Quantization (TTQ), a method that can reduce the precision of weights in neural networks to ternary values while improving the accuracy of some models (32-, 44-, and 56-layer ResNet) on CIFAR-10 and AlexNet on ImageNet.
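The quantizer at the heart of such methods maps each weight to one of three values using a magnitude threshold; in TTQ the positive and negative scales are learned. A minimal sketch with fixed, illustrative scale values:

```python
import numpy as np

def ternarize(w, wp, wn, t=0.05):
    """Map weights to {+wp, 0, -wn} using a threshold that is a fraction t of
    the largest weight magnitude."""
    delta = t * np.max(np.abs(w))
    q = np.zeros_like(w)
    q[w > delta] = wp
    q[w < -delta] = -wn
    return q

w = np.random.default_rng(8).standard_normal((4, 4))
print(ternarize(w, wp=1.2, wn=0.8))             # wp/wn would be learned in TTQ
```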
DoReFa-Net: Training Low Bitwidth Convolutional Neural Networks with Low Bitwidth Gradients
TLDR
DoReFa-Net, a method to train convolutional neural networks that have low-bitwidth weights and activations using low-bitwidth parameter gradients, is proposed and can achieve prediction accuracy comparable to 32-bit counterparts.
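The k-bit weight quantizer described in the DoReFa-Net paper (tanh squashing to [0, 1], uniform quantization, then remapping to [-1, 1]) can be written down directly; the straight-through estimator used in the backward pass is omitted in this sketch.

```python
import numpy as np

def quantize_k(x, k):
    """Quantize x in [0, 1] to k bits (uniform grid with 2**k levels)."""
    n = 2 ** k - 1
    return np.round(x * n) / n

def dorefa_weights(w, k):
    """k-bit weight quantization: squash weights to [0, 1] with tanh,
    quantize, then map back to [-1, 1]."""
    t = np.tanh(w)
    x = t / (2.0 * np.max(np.abs(t))) + 0.5
    return 2.0 * quantize_k(x, k) - 1.0

w = np.random.default_rng(9).standard_normal(8)
print(dorefa_weights(w, k=2))                   # values drawn from a 4-level grid in [-1, 1]
```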
Learning both Weights and Connections for Efficient Neural Network
TLDR
A method is proposed to reduce the storage and computation required by neural networks by an order of magnitude without affecting their accuracy, by learning only the important connections and pruning redundant connections with a three-step method.
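The prune step and the masked retraining update can be sketched as follows. This is a minimal magnitude-pruning illustration with an assumed sparsity level; the cited method iterates train/prune/retrain on full networks.

```python
import numpy as np

def magnitude_prune(w, sparsity=0.75):
    """Zero out the smallest-magnitude weights; return pruned weights and mask."""
    k = int(sparsity * w.size)
    thresh = np.partition(np.abs(w).ravel(), k)[k]
    mask = (np.abs(w) >= thresh).astype(w.dtype)
    return w * mask, mask

def masked_sgd_step(w, grad, mask, lr=1e-2):
    """Retraining update that keeps pruned connections at exactly zero."""
    return (w - lr * grad) * mask

w = np.random.default_rng(10).standard_normal((8, 8))
w_pruned, mask = magnitude_prune(w)
print("remaining connections:", int(mask.sum()), "of", w.size)
```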
...
...