3U-EdgeAI: Ultra-Low Memory Training, Ultra-Low Bitwidth Quantization, and Ultra-Low Latency Acceleration

@article{Chen20213UEdgeAIUM,
  title={3U-EdgeAI: Ultra-Low Memory Training, Ultra-Low Bitwidth Quantization, and Ultra-Low Latency Acceleration},
  author={Yaoxing Chen and Cole Hawkins and Kaiqi Zhang and Zheng Zhang and Cong Hao},
  journal={Proceedings of the 2021 on Great Lakes Symposium on VLSI},
  year={2021}
}
Deep neural network (DNN)-based AI applications on the edge require both low-cost computing platforms and high-quality services. However, the limited memory, computing resources, and power budget of edge devices constrain the effectiveness of DNN algorithms, making edge-oriented AI algorithms and implementations (e.g., accelerators) challenging to develop. In this paper, we summarize our recent efforts toward efficient on-device AI development from three aspects, including both training…
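As a rough illustration of the ultra-low bitwidth quantization theme, a symmetric uniform quantizer can map float weights onto a handful of signed levels. This is a minimal sketch under assumed conventions (per-tensor scale, round-to-nearest), not the paper's actual quantization scheme:

```python
import numpy as np

def quantize(w, bits=2):
    # Symmetric uniform quantization: map float weights onto the
    # signed integer grid {-qmax, ..., qmax}, then dequantize so the
    # result can be plugged back into a float model for simulation.
    qmax = 2 ** (bits - 1) - 1            # e.g. 1 for 2-bit
    scale = np.max(np.abs(w)) / qmax      # per-tensor scale (assumed)
    q = np.clip(np.round(w / scale), -qmax, qmax)
    return q * scale, q

w = np.array([0.9, -0.45, 0.1, -0.8])
deq, q = quantize(w, bits=2)   # q holds only the levels -1, 0, 1
```

At 2 bits each weight collapses to one of three levels, which is what enables multiplier-free (add/subtract only) hardware datapaths in binarized and ternarized accelerators.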
