• Corpus ID: 231698498

CPT: Efficient Deep Neural Network Training via Cyclic Precision

  • Y. Fu, Han Guo, Meng Li, Xin Yang, Yining Ding, Vikas Chandra, Yingyan Lin
Low-precision deep neural network (DNN) training has gained tremendous attention, as reducing precision is one of the most effective knobs for boosting DNNs' training time and energy efficiency. In this paper, we explore low-precision training from a new perspective inspired by recent findings in understanding DNN training: we conjecture that DNNs' precision may have an effect similar to that of the learning rate during DNN training, and advocate dynamic precision along the training… 
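CPT's idea of treating precision like a learning rate amounts to a cyclic bit-width schedule. A minimal sketch is below; the cosine shape matches the paper's cyclic spirit, but `bit_min`, `bit_max`, and `cycle_len` here are illustrative hyperparameters, not the paper's values.

```python
import math

def cyclic_precision(step, cycle_len, bit_min=3, bit_max=8):
    """Return the bit-width to train with at a given step, cycling from
    bit_min up to bit_max within each cycle via a cosine ramp.
    bit_min / bit_max / cycle_len are illustrative hyperparameters."""
    t = (step % cycle_len) / cycle_len  # position in [0, 1) within the cycle
    bits = bit_min + 0.5 * (bit_max - bit_min) * (1 - math.cos(math.pi * t))
    return int(round(bits))
```

Each cycle starts at low precision (a coarse, regularizing signal) and ends at high precision, restarting when the next cycle begins.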
Double-Win Quant: Aggressively Winning Robustness of Quantized Deep Neural Networks via Random Precision Training and Inference
It is identified that when an adversarially trained model is quantized to different precisions in a post-training manner, the associated adversarial attacks transfer poorly between those precisions, enabling an aggressive “win-win” in terms of DNNs’ robustness and efficiency.
n-hot: Efficient bit-level sparsity for powers-of-two neural network quantization
This work proposes an efficient PoT quantization scheme that suppresses the accuracy drop by 0.3% at most while reducing the number of operations by about 75% and model size by 11.5% compared to the uniform method.
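Powers-of-two (PoT) quantization replaces multiplications with bit shifts by snapping each magnitude to the nearest power of two. A minimal sketch of plain PoT quantization is below (the n-hot scheme's bit-level sparsity encoding is not modeled here, and the clipping range is an assumption for weights normalized to [-1, 1]):

```python
import math

def pot_quantize(x, bit_width=4):
    """Quantize x to a signed power of two: sign(x) * 2**exp, with exp
    the nearest integer exponent, clipped to bit_width's range.
    Assumes values normalized so the largest magnitude is about 1."""
    if x == 0.0:
        return 0.0
    sign = 1.0 if x > 0 else -1.0
    exp = round(math.log2(abs(x)))
    exp = max(min(exp, 0), -(2 ** (bit_width - 1)) + 1)  # clip exponent range
    return sign * (2.0 ** exp)
```

Because every level is a power of two, multiplying by a quantized weight reduces to a bit shift in hardware, which is where the operation savings come from.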
2-in-1 Accelerator: Enabling Random Precision Switch for Winning Both Adversarial Robustness and Efficiency
This work proposes a Random Precision Switch algorithm that effectively defends DNNs against adversarial attacks by enabling random DNN quantization as an in-situ model switch during training and inference, along with an integrated algorithm-accelerator co-design framework aimed at winning both the adversarial robustness and efficiency of DNN accelerators.
Advances in Classifying the Stages of Diabetic Retinopathy Using Convolutional Neural Networks in Low Memory Edge Devices
A convolutional neural network model is proposed to detect all stages of DR on a low-memory edge microcontroller, and is found to be highly effective in improving prognosis accuracy.
InstantNet: Automated Generation and Deployment of Instantaneously Switchable-Precision Networks
This work proposes InstantNet to automatically generate and deploy instantaneously switchable-precision networks that operate at variable bit-widths, and shows that InstantNet consistently outperforms state-of-the-art designs.
LDP: Learnable Dynamic Precision for Efficient Deep Neural Network Training and Inference
LDP is a Learnable Dynamic Precision DNN training framework that automatically learns a temporally and spatially dynamic precision schedule during training toward optimal accuracy-efficiency trade-offs; the authors also visualize the resulting precision schedules and distributions of LDP-trained DNNs on different tasks.
Service Delay Minimization for Federated Learning over Mobile Devices
The results show that SDEFL notably reduces service delay with a small accuracy drop compared to peer designs.
F8Net: Fixed-Point 8-bit Only Multiplication for Network Quantization
This work presents F8Net, a novel quantization framework consisting of only fixed-point 8-bit multiplication, which achieves comparable or better performance than existing quantization techniques using INT32 multiplication or floating-point arithmetic, and even than full-precision counterparts, reaching state-of-the-art results.
Overview frequency principle/spectral bias in deep learning
This work provides an overview of the F-Principle, which inspires the design of DNN-based algorithms for practical problems and explains experimental phenomena emerging in various scenarios, and proposes open problems for future research to further advance the study of deep learning from the frequency perspective.
Frequency Principle: Fourier Analysis Sheds Light on Deep Neural Networks
A very universal Frequency Principle (F-Principle), namely that DNNs often fit target functions from low to high frequencies, is demonstrated on high-dimensional benchmark datasets such as MNIST/CIFAR10 and deep neural networks such as VGG16.
E2-Train: Training State-of-the-art CNNs with Over 80% Energy Savings
This paper attempts to conduct more energy-efficient training of CNNs, so as to enable on-device training, by dropping unnecessary computations from three complementary levels: stochastic mini-batch dropping on the data level; selective layer update on the model level; and sign prediction for low-cost, low-precision back-propagation, on the algorithm level.
Adding Gradient Noise Improves Learning for Very Deep Networks
This paper explores the low-overhead and easy-to-implement optimization technique of adding annealed Gaussian noise to the gradient, which it is found surprisingly effective when training these very deep architectures.
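The annealed-noise schedule in that paper, Gaussian noise with variance eta / (1 + t)**gamma, is simple to sketch; eta = 0.3 and gamma = 0.55 are among the values the paper reports, used here as defaults:

```python
import random

def add_gradient_noise(grad, step, eta=0.3, gamma=0.55):
    """Add zero-mean Gaussian noise with annealed variance
    eta / (1 + step)**gamma to each gradient component, so the noise
    shrinks as training progresses."""
    sigma = (eta / (1.0 + step) ** gamma) ** 0.5
    return [g + random.gauss(0.0, sigma) for g in grad]
```

Early in training the noise helps escape poor regions; as `step` grows, the perturbation anneals away and updates become nearly deterministic.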
WRPN: Wide Reduced-Precision Networks
This work reduces the precision of activation maps (along with model parameters) and increases the number of filter maps in a layer, finding that this scheme matches or surpasses the accuracy of the baseline full-precision network.
Scalable Methods for 8-bit Training of Neural Networks
This work is the first to quantize the weights, activations, as well as a substantial volume of the gradients stream, in all layers (including batch normalization) to 8-bit while showing state-of-the-art results over the ImageNet-1K dataset.
Towards Unified INT8 Training for Convolutional Neural Network
This work attempts to build a unified 8-bit (INT8) training framework for common convolutional neural networks, covering both accuracy and speed, and proposes two universal techniques that reduce the direction deviation of gradients and avoid illegal gradient updates along the wrong direction.
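A generic building block underlying such INT8 training frameworks is symmetric fake quantization: simulate 8-bit arithmetic by rounding to an integer grid and dequantizing. This is a textbook sketch, not the paper's specific gradient-direction techniques:

```python
def fake_quant_int8(x, scale):
    """Symmetric INT8 fake quantization: round x / scale to an integer
    clipped to [-127, 127], then dequantize back to float. 'scale' maps
    the clipping range onto the INT8 grid."""
    q = max(-127, min(127, round(x / scale)))
    return q * scale
```

Applying this to weights, activations, and (in such frameworks) gradients lets full training run while every multiply sees only INT8-representable values.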
Mixed Precision Training
This work introduces a technique to train deep neural networks using half precision floating point numbers, and demonstrates that this approach works for a wide variety of models including convolution neural networks, recurrent neural networks and generative adversarial networks.
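The core trick that makes half-precision training work is loss scaling, which can be emulated in pure Python with `struct`'s IEEE half-precision format; the scale of 1024 below is illustrative:

```python
import struct

def to_fp16(x):
    """Round a float to IEEE half precision (emulates fp16 storage)."""
    return struct.unpack('<e', struct.pack('<e', x))[0]

def scaled_grads(loss_grads, loss_scale=1024.0):
    """Loss scaling from mixed-precision training: multiply gradients up
    before they pass through fp16, then divide back down in full
    precision, so tiny values that would underflow in fp16 survive."""
    return [to_fp16(g * loss_scale) / loss_scale for g in loss_grads]
```

A gradient like 1e-8 rounds to zero in fp16 on its own, but survives once pre-scaled by 1024, which is exactly why the technique preserves accuracy at half precision.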
Learned Step Size Quantization
This work introduces a novel means to estimate and scale the task loss gradient at each weight and activation layer's quantizer step size, such that it can be learned in conjunction with other network parameters.
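The step-size gradient that LSQ derives (via the straight-through estimator) can be written out for a single value; the `qn`/`qp` bounds below assume signed 4-bit quantization:

```python
def lsq(v, s, qn=-8, qp=7):
    """LSQ: quantize v with learnable step size s and return
    (v_hat, d v_hat / d s). Inside the clip range the step-size gradient
    is round(v/s) - v/s; at the clip bounds it is qn or qp."""
    r = v / s
    if r <= qn:
        return qn * s, float(qn)
    if r >= qp:
        return qp * s, float(qp)
    q = round(r)
    return q * s, q - r
```

Returning the gradient alongside the quantized value is what lets `s` be updated by the same optimizer as the weights.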
Cyclical Learning Rates for Training Neural Networks
  • Leslie N. Smith
  • Computer Science
    2017 IEEE Winter Conference on Applications of Computer Vision (WACV)
  • 2017
A new method for setting the learning rate, named cyclical learning rates, is described, which practically eliminates the need to experimentally find the best values and schedule for the global learning rates.
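Smith's triangular policy is simple to state; `base_lr`, `max_lr`, and `step_size` are the quantities the method largely removes the need to hand-tune:

```python
def triangular_clr(step, base_lr=1e-4, max_lr=1e-2, step_size=2000):
    """Triangular cyclical learning rate: ramp linearly from base_lr to
    max_lr over step_size iterations, then back down, repeating."""
    cycle = step // (2 * step_size)
    x = abs(step / step_size - 2 * cycle - 1)
    return base_lr + (max_lr - base_lr) * max(0.0, 1.0 - x)
```

This is the schedule CPT's cyclic precision analogy builds on: the bit-width plays the role the learning rate plays here.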
Training High-Performance and Large-Scale Deep Neural Networks with Full 8-bit Integers