From OpenCL to high-performance hardware on FPGAs
TLDR
It is shown that the OpenCL computing paradigm is a viable design entry method for high-performance computing applications on FPGAs and that it can achieve a clock frequency in excess of 160 MHz on benchmarks.
VESPA: portable, scalable, and flexible FPGA-based vector processors
TLDR
A system of vectorized software and soft vector processor hardware that is portable to any FPGA architecture and vector processor configuration, scalable to larger yet higher-performance designs, and flexible, allowing the underlying vector processor to be customized to match the needs of each application.
The microarchitecture of FPGA-based soft processors
TLDR
An infrastructure for rapidly generating RTL models of soft processors is presented, along with a methodology for measuring their area, performance, and power.
Exploration and Customization of FPGA-Based Soft Processors
TLDR
An exploration of the microarchitectural tradeoffs for soft processors and a set of customization techniques that capitalizes on these tradeoffs to improve the efficiency of soft processors for specific applications are provided.
An FPGA-based Pentium® in a complete desktop system
TLDR
This work uses an FPGA-based emulation system to conduct preliminary architectural experiments, including growing the branch target buffer and the level 1 caches, and experiments with interfacing hardware accelerators such as DES and AES engines, which resulted in 27x speedups.
OpenCL for FPGAs: Prototyping a Compiler
TLDR
This paper presents a framework to support OpenCL compilation to FPGAs, describing the compilation flow and reporting results on a set of benchmarks that show the effectiveness of the automated compiler.
Application-specific customization of soft processor microarchitecture
TLDR
It is found that the processor design that is fastest overall is often also the fastest design for an individual application, and that a processor customized to support only that subset of the ISA for a specific application can on average offer 25% savings in both area and energy.
Portable, Flexible, and Scalable Soft Vector Processors
TLDR
This work proposes extending soft processors with vector extensions to exploit the abundant data parallelism found in many embedded kernels, executing these kernels much faster than a single-core processor and hence reducing the need for hardware implementations.
A parameterized automatic cache generator for FPGAs
TLDR
A cache generator that can produce caches with a variety of associativities, latencies, and dimensions is presented, allowing system designers to effortlessly create and investigate different caches in order to better meet the needs of their target system.
Fine-grain performance scaling of soft vector processors
TLDR
This work adds support for vector chaining and heterogeneous vector lanes, allowing the soft vector processor to be customized not only to the data-level parallelism available in an application but also to its functional unit demand.