High-Speed Power-Efficient Coarse-Grained Convolver Architecture using Depth-First Compression Scheme

@article{Wu2020HighSpeedPC,
  title={High-Speed Power-Efficient Coarse-Grained Convolver Architecture using Depth-First Compression Scheme},
  author={Yi-Lin Wu and Yi Lu and Juinn-Dar Huang},
  journal={2020 IEEE International Symposium on Circuits and Systems (ISCAS)},
  year={2020},
  pages={1--5},
  url={https://api.semanticscholar.org/CorpusID:224908147}
}
This paper proposes a high-speed power-efficient convolver architecture for CNN acceleration that features a globally delay-optimized partial product reduction tree and a depth-first compression scheme for both area and power minimization.
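The idea of a single partial product reduction tree shared by a whole convolution window can be illustrated with a small software model. The sketch below is a generic carry-save (3:2 compressor) reduction over the pooled partial products of several unsigned multiplications. It is not the paper's delay-optimized tree or its depth-first compression scheme, and all function names are hypothetical.

```python
def partial_products(a: int, b: int) -> list[int]:
    """Shift-and-add partial products of a * b for unsigned a, b (one row per set bit of b)."""
    return [a << i for i in range(b.bit_length()) if (b >> i) & 1]


def compress_3_2(x: int, y: int, z: int) -> tuple[int, int]:
    """Row of full adders: returns (s, c) such that x + y + z == s + c."""
    s = x ^ y ^ z                                # per-bit sum
    c = ((x & y) | (x & z) | (y & z)) << 1       # per-bit carry, shifted to its weight
    return s, c


def reduce_rows(rows: list[int]) -> int:
    """Apply 3:2 compressors until at most two rows remain, then do one final addition."""
    rows = list(rows)
    while len(rows) > 2:
        nxt = []
        for i in range(0, len(rows) - 2, 3):
            nxt.extend(compress_3_2(rows[i], rows[i + 1], rows[i + 2]))
        nxt.extend(rows[len(rows) - len(rows) % 3:])  # rows left over this stage
        rows = nxt
    return sum(rows)                             # stands in for the final carry-propagate adder


def dot_merged(weights: list[int], pixels: list[int]) -> int:
    """Dot product computed by pooling every partial product into one reduction tree."""
    rows = [pp for w, x in zip(weights, pixels) for pp in partial_products(x, w)]
    return reduce_rows(rows)
```

Pooling the rows means only one carry-propagate addition is modeled for the whole window, rather than one per multiplication. In hardware terms, that is the usual motivation for merging multipliers and adders into a single tree.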

Energy-Efficient High-Speed ASIC Implementation of Convolutional Neural Network Using Novel Reduced Critical-Path Design

This paper proposes a hardware-efficient, high-speed convolution block for ASIC implementation of the CNN algorithm using a novel bit-level-multiply-accumulator (BLMAC) with a modified Booth encoder and a Wallace reduction tree.
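For readers unfamiliar with modified Booth encoding, the following is a minimal textbook sketch of radix-4 recoding, which roughly halves the number of partial products fed to a reduction tree. It is not the BLMAC design from the paper. The function names and the `bits` parameter are assumptions made for illustration, and `b` is assumed to fit in a `bits`-wide two's-complement word.

```python
# Digit selected by the overlapping bit triple (b[i+1], b[i], b[i-1]).
BOOTH_TABLE = [0, 1, 1, 2, -2, -1, -1, 0]


def booth_radix4_digits(b: int, bits: int) -> list[int]:
    """Recode a signed multiplier into radix-4 digits in {-2, -1, 0, 1, 2}, least significant first."""
    if bits % 2:
        bits += 1                          # sign-extend to an even width
    u = (b & ((1 << bits) - 1)) << 1       # two's-complement bits with implicit b[-1] = 0 appended
    return [BOOTH_TABLE[(u >> i) & 0b111] for i in range(0, bits, 2)]


def booth_multiply(a: int, b: int, bits: int) -> int:
    """Multiply by summing one partial product per Booth digit (digit * a, weighted by 4**k)."""
    return sum((d * a) << (2 * k) for k, d in enumerate(booth_radix4_digits(b, bits)))
```

In this sketch a 4-bit multiplier yields two digits instead of four bit-level rows. Each digit selects 0, ±a, or ±2a, which hardware can generate with shifts and negation only.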

An SoC Integration Ready VLIW-Driven CNN Accelerator with High Utilization and Scalability

This paper presents a highly scalable VLIW-driven CNN accelerator architecture that has enabled a real-time image semantic segmentation system for autonomous driving on an FPGA system and is ready for efficient and easy SoC integration.

GOMARL: Global Optimization of Multiplier using Multi-Agent Reinforcement Learning

Yi Feng, Chao Wang. Computer Science, Engineering. 2024.
This paper proposes a multi-agent reinforcement learning (MA-RL) based framework in which two agents cooperate to optimize a multiplier jointly in terms of area and delay. The reported results show that multipliers optimized by GOMARL improve delay by more than 7% and area by more than 5% compared with baseline designs.