Achieving Super-Linear Speedup across Multi-FPGA for Real-Time DNN Inference

@article{Jiang2019AchievingSS,
  title={Achieving Super-Linear Speedup across Multi-FPGA for Real-Time DNN Inference},
  author={Weiwen Jiang and Edwin Hsing-Mean Sha and Xuezhu Zhang and Russell Enns and Qingfeng Zhuge and Yiyu Shi and Jingtong Hu},
  journal={ArXiv},
  year={2019},
  volume={abs/1907.08985}
}
Real-time Deep Neural Network (DNN) inference with low-latency requirements has become increasingly important for numerous applications in both cloud computing (e.g., Apple's Siri) and edge computing (e.g., Google/Waymo's driverless car). FPGA-based DNN accelerators have demonstrated superior flexibility and performance; moreover, for real-time inference with small batch sizes, FPGAs are expected to achieve further performance improvements. However, the performance gain from the single-FPGA…
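The abstract's headline claim of super-linear speedup across multiple FPGAs can be illustrated with a toy analytical model. The sketch below is a hypothetical illustration, not the paper's method: it assumes that speedup beyond k× on k FPGAs can arise once each device's shard of a layer's weights fits entirely in on-chip BRAM, eliminating off-chip DRAM traffic. All constants (BRAM_BYTES, DRAM_BW, ONCHIP_BW, MACS_PER_SEC) and the layer_latency helper are made-up assumptions for demonstration.

    # Hypothetical toy model (not the paper's method): per-layer latency
    # on k FPGAs, where super-linear speedup appears once each shard of
    # the weights fits in on-chip BRAM and DRAM traffic disappears.

    BRAM_BYTES = 4 * 2**20   # assumed on-chip memory per FPGA (4 MiB)
    DRAM_BW = 10e9           # assumed off-chip bandwidth (bytes/s)
    ONCHIP_BW = 100e9        # assumed on-chip bandwidth (bytes/s)
    MACS_PER_SEC = 1e12      # assumed compute throughput per FPGA (MACs/s)

    def layer_latency(weight_bytes: float, macs: float, num_fpgas: int) -> float:
        """Latency of one layer whose weights are split evenly across FPGAs."""
        shard = weight_bytes / num_fpgas
        # If the shard fits on-chip, weights stream at on-chip bandwidth;
        # otherwise every inference re-reads them from DRAM.
        bw = ONCHIP_BW if shard <= BRAM_BYTES else DRAM_BW
        mem_time = shard / bw
        compute_time = macs / (MACS_PER_SEC * num_fpgas)
        return max(mem_time, compute_time)  # memory and compute overlap

    if __name__ == "__main__":
        weights, macs = 50e6, 2e9  # an illustrative 50 MB, 2 GMAC layer
        t1 = layer_latency(weights, macs, 1)
        for k in (1, 2, 4, 8, 16):
            tk = layer_latency(weights, macs, k)
            print(f"{k:2d} FPGAs: {tk * 1e6:8.1f} us, speedup {t1 / tk:5.2f}x")

Under these assumed numbers the model stays memory-bound and roughly linear up to 8 FPGAs, then jumps to about 40x at 16 FPGAs once the 50 MB of weights fit in aggregate on-chip memory, which is the classic mechanism behind super-linear scaling of this kind.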
