Throughput-Optimized OpenCL-based FPGA Accelerator for Large-Scale Convolutional Neural Networks
This work presents a systematic design space exploration methodology to maximize the throughput of an OpenCL-based FPGA accelerator for a given CNN model, considering the FPGA's resource constraints such as on-chip memory, registers, computational resources, and external memory bandwidth.
Exploring sub-20nm FinFET design with Predictive Technology Models
Predictive MOSFET models are critical for early-stage design-technology co-optimization and circuit design research. PTMs for FinFET devices are generated for five technology nodes corresponding to the years 2012-2020 on the ITRS roadmap.
What is Predictive Technology Model (PTM)?
  • Yu Cao
  • Engineering
  • SIGD
  • 1 March 2009
The minimum feature size of CMOS technology will reach 10nm in ten years. Beyond that benchmark, the present scaling approach may have to take a different route. The grand challenge to integrated…
Digital Circuit Design Challenges and Opportunities in the Era of Nanoscale CMOS
New techniques for logic circuits and interconnect, for memory, and for clock and power distribution are discussed, along with the role of geometrically regular circuits as one promising solution.
Fully parallel write/read in resistive synaptic array for accelerating on-chip learning.
A novel fully parallel write scheme is designed and experimentally demonstrated in a small-scale crossbar array to accelerate the weight update in the training process, at a speed that is independent of the array size.
Large-Scale Neuromorphic Spiking Array Processors: A Quest to Mimic the Brain
Some of the most significant neuromorphic spiking emulators are described, the different architectures and approaches used by them are compared, their advantages and drawbacks are illustrated, and the capabilities that each can deliver to neural modelers are highlighted.
Compact Modeling of BTI for Circuit Reliability Analysis
The aging process due to Bias Temperature Instability (BTI) is a key limiting factor of circuit lifetime in contemporary CMOS design. Threshold voltage shift induced by BTI is a strong function of…
A resilience roadmap: (invited paper)
Technology scaling has an increasing impact on the resilience of CMOS circuits. This outcome is the result of (a) increasing sensitivity to various intrinsic and extrinsic noise sources as circuits…
Automatic Compiler Based FPGA Accelerator for CNN Training
This work presents an automatic compiler-based FPGA accelerator with 16-bit fixed-point precision for complete CNN training, including Forward Pass (FP), Backward Pass (BP), and Weight Update (WU). It implements an optimized RTL library to perform training-specific tasks and develops an RTL compiler to automatically generate FPGA-synthesizable RTL based on user-defined constraints.