High-Speed Power-Efficient Coarse-Grained Convolver Architecture using Depth-First Compression Scheme

Yi-Lin Wu; Yi Lu; Juinn-Dar Huang

DOI:10.1109/ISCAS45731.2020.9180406
Corpus ID: 224908147

@article{Wu2020HighSpeedPC,
  title={High-Speed Power-Efficient Coarse-Grained Convolver Architecture using Depth-First Compression Scheme},
  author={Yi-Lin Wu and Yi Lu and Juinn-Dar Huang},
  journal={2020 IEEE International Symposium on Circuits and Systems (ISCAS)},
  year={2020},
  pages={1-5},
  url={https://api.semanticscholar.org/CorpusID:224908147}
}

Published in International Symposium on… 1 October 2020
Computer Science, Engineering

This paper proposes a high-speed power-efficient convolver architecture for CNN acceleration that features a globally delay-optimized partial product reduction tree and a depth-first compression scheme for both area and power minimization.

3 Citations

Energy-Efficient High-Speed ASIC Implementation of Convolutional Neural Network Using Novel Reduced Critical-Path Design

Sun Sik Lee; Thanh Dat Nguyen; P. Meher; S. Park

Computer Science, Engineering

IEEE Access

2022

This paper proposes a hardware-efficient, high-speed convolution block for ASIC implementation of the CNN algorithm using a novel bit-level-multiply-accumulator (BLMAC) with a modified Booth encoder and a Wallace reduction tree.

An SoC Integration Ready VLIW-Driven CNN Accelerator with High Utilization and Scalability

Chia-Heng Hu; I-Hao Tseng; Pei-Hsuan Kuo; Juinn-Dar Huang

Computer Science, Engineering

2022 IEEE 4th International Conference on…

2022

This paper presents a highly scalable VLIW-driven CNN accelerator architecture that has enabled a real-time image semantic segmentation system for autonomous driving on an FPGA system and is ready for efficient and easy SoC integration.

GOMARL: Global Optimization of Multiplier using Multi-Agent Reinforcement Learning

Yi Feng; Chao Wang

Computer Science, Engineering

2024 2nd International Symposium of Electronics…

2024

This paper proposes a multi-agent reinforcement learning (MA-RL) based framework in which two agents cooperate to optimize a multiplier as a whole in terms of area and delay. Multipliers optimized by GOMARL improve delay by more than 7% and area by more than 5% compared with baseline designs.
