Channel-wise Mixed-precision Assignment for DNN Inference on Constrained Edge Nodes
@article{Risso2022ChannelwiseMA,
  title={Channel-wise Mixed-precision Assignment for DNN Inference on Constrained Edge Nodes},
  author={Matteo Risso and Alessio Burrello and Luca Benini and Enrico Macii and Massimo Poncino and Daniele Jahier Pagliari},
  journal={2022 IEEE 13th International Green and Sustainable Computing Conference (IGSC)},
  year={2022},
  pages={1--6}
}
Quantization is widely employed in both cloud and edge systems to reduce the memory occupation, latency, and energy consumption of deep neural networks. In particular, mixed-precision quantization, i.e., the use of different bit-widths for different portions of the network, has been shown to provide excellent efficiency gains with limited accuracy drops, especially with optimized bit-width assignments determined by automated Neural Architecture Search (NAS) tools. State-of-the-art mixed…