Channel-wise Mixed-precision Assignment for DNN Inference on Constrained Edge Nodes

@article{Risso2022ChannelwiseMA,
  title={Channel-wise Mixed-precision Assignment for DNN Inference on Constrained Edge Nodes},
  author={Matteo Risso and Alessio Burrello and Luca Benini and Enrico Macii and Massimo Poncino and Daniele Jahier Pagliari},
  journal={2022 IEEE 13th International Green and Sustainable Computing Conference (IGSC)},
  year={2022},
  pages={1-6}
}
Quantization is widely employed in both cloud and edge systems to reduce the memory occupation, latency, and energy consumption of deep neural networks. In particular, mixed-precision quantization, i.e., the use of different bit-widths for different portions of the network, has been shown to provide excellent efficiency gains with limited accuracy drops, especially with optimized bit-width assignments determined by automated Neural Architecture Search (NAS) tools. State-of-the-art mixed… 
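To illustrate the channel-wise mixed-precision idea summarized in the abstract, the following is a minimal sketch of per-channel fake quantization in which each output channel of a weight tensor is assigned its own bit-width. It assumes PyTorch; the function name, shapes, and bit-width values are illustrative assumptions and not taken from the paper's code or NAS tool.

```python
# Minimal sketch (not the authors' implementation): fake-quantize a conv weight
# tensor per output channel, where each channel may use a different bit-width.
import torch

def fake_quantize_per_channel(weight: torch.Tensor, bitwidths: torch.Tensor) -> torch.Tensor:
    """Quantize each output channel of `weight` (shape [C_out, ...]) onto its own
    symmetric integer grid, then dequantize back to float ("fake" quantization)."""
    w = weight.flatten(1)                                      # [C_out, rest]
    max_abs = w.abs().amax(dim=1, keepdim=True)                # per-channel range
    qmax = (2.0 ** (bitwidths.float() - 1) - 1).unsqueeze(1)   # e.g. 7 for 4-bit
    scale = max_abs / qmax.clamp(min=1.0)                      # per-channel step size
    w_q = torch.clamp(torch.round(w / scale), -qmax, qmax) * scale
    return w_q.view_as(weight)

# Example: a conv layer with 8 output channels mixing 8-, 4-, and 2-bit channels
# (a hypothetical assignment such as a NAS tool might produce).
weight = torch.randn(8, 3, 3, 3)
bitwidths = torch.tensor([8, 8, 4, 4, 4, 2, 2, 2])
weight_q = fake_quantize_per_channel(weight, bitwidths)
```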
