Term Revealing: Furthering Quantization at Run Time on Quantized DNNs

@article{Kung2020TermRF,
  title={Term Revealing: Furthering Quantization at Run Time on Quantized DNNs},
  author={H. T. Kung and Bradley McDanel and S. Zhang},
  journal={ArXiv},
  year={2020},
  volume={abs/2007.06389}
}
We present a novel technique, called Term Revealing (TR), for furthering quantization at run time for improved performance of Deep Neural Networks (DNNs) already quantized with conventional quantization methods. TR operates on power-of-two terms in the binary expressions of values. When computing a dot product, TR dynamically selects a fixed number of the largest terms to use from the values of the two vectors. By exploiting normal-like weight and data distributions…
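The term-selection step is straightforward to illustrate. The sketch below is a minimal Python rendering of the idea under stated assumptions, not the authors' implementation: it takes a group of already-quantized, non-negative integer values, decomposes each into its power-of-two terms, and keeps only a fixed per-group budget of the largest terms, as the abstract describes. The names term_reveal and power_of_two_terms, and the group/budget interface, are illustrative.

    import numpy as np

    def power_of_two_terms(x, num_bits=8):
        # Exponents of the set bits in x's binary expansion,
        # i.e., the power-of-two terms of the value.
        return [e for e in range(num_bits) if (int(x) >> e) & 1]

    def term_reveal(group, budget):
        # Keep only the `budget` largest power-of-two terms across a
        # group of quantized values, zeroing out the rest.
        # Illustrative sketch: assumes non-negative integers; the paper
        # would also need to handle signed values.
        terms = [(e, i) for i, v in enumerate(group)
                 for e in power_of_two_terms(v)]
        # Largest exponents first; retain only `budget` terms in total.
        kept = sorted(terms, key=lambda t: t[0], reverse=True)[:budget]
        # Rebuild the truncated values from the surviving terms.
        out = np.zeros(len(group), dtype=np.int64)
        for e, i in kept:
            out[i] += 1 << e
        return out

    # Example: a 4-value group truncated to a 6-term budget.
    g = np.array([183, 12, 5, 96])   # binary: 10110111, 1100, 101, 1100000
    print(term_reveal(g, budget=6))  # -> [176   8   0  96]

Budgeting terms per group, rather than bits per value, is what makes the truncation dynamic: values with few significant terms cede their share of the budget to values with many, which is where the normal-like distributions mentioned in the abstract come into play.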