Publications
From software to accelerators with LegUp high-level synthesis
TLDR
This paper presents an overview of the LegUp design methodology and system architecture, and discusses ongoing work on profiling and hardware/software partitioning.
The Effect of Compiler Optimizations on High-Level Synthesis for FPGAs
TLDR
We consider the impact of compiler optimizations on the quality of high-level synthesis (HLS)-generated FPGA hardware.
FPGA-based CNN inference accelerator synthesized from multi-threaded C software
TLDR
A deep-learning inference accelerator is synthesized from a C-language software program parallelized with Pthreads.
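As a rough illustration of the input style this work describes, the following is a minimal pthreads-parallelized C sketch of the kind a pthreads-aware HLS flow such as LegUp can map to parallel hardware, with each software thread becoming an accelerator instance. The dot-product workload, thread count, and names are illustrative assumptions, not the paper's code.

/*
 * Hypothetical sketch of pthreads-parallelized C for a threads-aware HLS flow.
 * Each thread computes a dot product over its own slice of the input arrays;
 * in hardware, the fork/join calls delimit concurrently running accelerators.
 */
#include <pthread.h>
#include <stdio.h>

#define N_THREADS 4
#define N         1024

static int a[N], b[N];
static long long partial[N_THREADS];

static void *dot_slice(void *arg) {
    int tid   = *(int *)arg;
    int chunk = N / N_THREADS;
    int start = tid * chunk;
    long long sum = 0;
    for (int i = start; i < start + chunk; i++)
        sum += (long long)a[i] * b[i];
    partial[tid] = sum;
    return NULL;
}

int main(void) {
    pthread_t th[N_THREADS];
    int ids[N_THREADS];

    for (int i = 0; i < N; i++) { a[i] = i; b[i] = 2 * i; }

    /* Fork: each pthread_create launches one worker (one accelerator instance). */
    for (int t = 0; t < N_THREADS; t++) {
        ids[t] = t;
        pthread_create(&th[t], NULL, dot_slice, &ids[t]);
    }

    /* Join: wait for all workers and combine their partial results. */
    long long total = 0;
    for (int t = 0; t < N_THREADS; t++) {
        pthread_join(th[t], NULL);
        total += partial[t];
    }
    printf("dot product = %lld\n", total);
    return 0;
}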
The Effect of Compiler Optimizations on High-Level Synthesis-Generated Hardware
TLDR
We consider the impact of compiler optimizations on the quality of high-level synthesis (HLS)-generated field-programmable gate array (FPGA) hardware.
Automating the Design of Processor/Accelerator Embedded Systems with LegUp High-Level Synthesis
TLDR
In this paper, we overview the LegUp framework and describe several recent developments: 1) support for an embedded ARM processor, as is available on Altera's recently released SoC FPGA; 2) HLS support for software parallelization schemes (pthreads and OpenMP); 3) enhancements to LegUp's core HLS algorithms that raise the quality of the auto-generated hardware; and 4) a preliminary debugging and verification framework.
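The pthreads and OpenMP support mentioned above takes ordinary parallel C as input. The following is a minimal, hypothetical OpenMP sketch of that input style; the vector-add workload, array size, and names are illustrative assumptions rather than code from the paper.

/*
 * Hypothetical OpenMP-style parallel C of the kind an OpenMP-aware HLS flow
 * accepts: the parallel-for loop is the software pattern that such a flow can
 * turn into replicated hardware units. Compile in software with -fopenmp.
 */
#include <stdio.h>

#define N 1024

int main(void) {
    static int a[N], b[N], c[N];

    for (int i = 0; i < N; i++) { a[i] = i; b[i] = 3 * i; }

    /* Loop iterations are divided among threads; an OpenMP-aware HLS flow can
       replicate the loop body into parallel hardware accordingly. */
    #pragma omp parallel for
    for (int i = 0; i < N; i++)
        c[i] = a[i] + b[i];

    printf("c[0]=%d c[%d]=%d\n", c[0], N - 1, c[N - 1]);
    return 0;
}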
LegUp High-Level Synthesis
TLDR
LegUp is a high-level synthesis tool that has been under active development at the University of Toronto since 2011.
Accelerating Memcached on AWS Cloud FPGAs
TLDR
In recent years, FPGAs have been deployed in data centres of major cloud service providers, such as Microsoft [1], Amazon [2], Alibaba [3], Tencent [4], Huawei [5], and Nimbix [6].
A unified software approach to specify pipeline and spatial parallelism in FPGA hardware
TLDR
We use the producer-consumer pattern, commonly used in multi-threaded programming, to infer the generation of hardware that can exploit both pipeline and spatial parallelism on FPGAs.
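To make the producer-consumer pattern concrete, the following is a minimal pthreads sketch of that software pattern; in the flow described above, the producer and consumer threads map to concurrently running hardware kernels connected by a FIFO (pipeline parallelism), and replicating the consumer yields spatial parallelism. The FIFO depth, workload, and names are illustrative assumptions, not the paper's code.

/*
 * Hypothetical producer-consumer sketch in pthreads: a bounded FIFO protected
 * by a mutex and condition variables. The producer pushes computed values; the
 * consumer pops and accumulates them, so the two stages overlap in time.
 */
#include <pthread.h>
#include <stdio.h>

#define FIFO_DEPTH 8
#define N_ITEMS    64

static int fifo[FIFO_DEPTH];
static int head = 0, tail = 0, count = 0;
static pthread_mutex_t lock      = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  not_full  = PTHREAD_COND_INITIALIZER;
static pthread_cond_t  not_empty = PTHREAD_COND_INITIALIZER;

static void *producer(void *arg) {
    (void)arg;
    for (int i = 0; i < N_ITEMS; i++) {
        pthread_mutex_lock(&lock);
        while (count == FIFO_DEPTH)            /* wait while the FIFO is full */
            pthread_cond_wait(&not_full, &lock);
        fifo[tail] = i * i;                    /* "compute" a value and push it */
        tail = (tail + 1) % FIFO_DEPTH;
        count++;
        pthread_cond_signal(&not_empty);
        pthread_mutex_unlock(&lock);
    }
    return NULL;
}

static void *consumer(void *arg) {
    long long *sum = arg;
    for (int i = 0; i < N_ITEMS; i++) {
        pthread_mutex_lock(&lock);
        while (count == 0)                     /* wait while the FIFO is empty */
            pthread_cond_wait(&not_empty, &lock);
        *sum += fifo[head];                    /* pop a value and accumulate it */
        head = (head + 1) % FIFO_DEPTH;
        count--;
        pthread_cond_signal(&not_full);
        pthread_mutex_unlock(&lock);
    }
    return NULL;
}

int main(void) {
    pthread_t prod, cons;
    long long sum = 0;
    pthread_create(&prod, NULL, producer, NULL);
    pthread_create(&cons, NULL, consumer, &sum);
    pthread_join(prod, NULL);
    pthread_join(cons, NULL);
    printf("sum of produced values = %lld\n", sum);
    return 0;
}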
A framework for FPGA-based acceleration of neural network inference with limited numerical precision via high-level synthesis with streaming functionality
TLDR
Deep neural networks are achieving state-of-the-art performance in many artificial intelligence tasks, such as computer vision and speech recognition. (Master of Applied Science thesis, Ruo Long Lian, Graduate Department of Electrical and Computer Engineering, University of Toronto, 2016.)
From C to Blokus Duo with LegUp high-level synthesis
TLDR
We apply high-level synthesis (HLS) to generate Blokus Duo game-playing hardware for the FPT 2013 Design Competition.