ECI: a Customizable Cache Coherency Stack for Hybrid FPGA-CPU Architectures
@article{Ramdas2022ECIAC, title={ECI: a Customizable Cache Coherency Stack for Hybrid FPGA-CPU Architectures}, author={Abishek Ramdas and Michael J. Giardino and Runbin Shi and Adam Turowski and David A. Cock and Gustavo Alonso and Timothy Roscoe}, journal={ArXiv}, year={2022}, volume={abs/2208.07124} }
Unlike other accelerators, FPGAs are capable of supporting cache coherency, thereby turning them into a more powerful architectural option than just a peripheral accelerator. However, most existing deployments of FPGAs are either non-cache coherent or support only an asymmetric design where cache coherency is controlled from the CPU. Taking advantage of a recently released two socket CPU-FPGA architecture, in this paper we describe A Customizable Caching Interface (ACCI), a flexible…
References
Enzian: an open, general, CPU/FPGA platform for systems software research
- Computer Science, ASPLOS
- 2022
It is shown that a research group can design and build a more general, open, and affordable hardware platform for hybrid systems research, and Enzian is capable of duplicating the functionality of existing CPU/FPGA systems with comparable performance but in an open, flexible system.
NoC-Based Support of Heterogeneous Cache-Coherence Models for Accelerators
- Computer Science, 2018 Twelfth IEEE/ACM International Symposium on Networks-on-Chip (NOCS)
- 2018
This work proposes an extension of a standard directory-based cache-coherence protocol, presents its design as part of a scalable memory hierarchy implemented over a NoC, and designs a many-accelerator SoC architecture that can support three main cache-coherence models for accelerators: non-coherent, last-level-cache-coherent, and fully coherent.
CAPI: A Coherent Accelerator Processor Interface
- Computer Science, IBM J. Res. Dev.
- 2015
The Coherent Accelerator Processor Interface (CAPI) enables attaching an accelerator as a coherent CPU peer over the I/O physical interface, and greatly increases the opportunities for acceleration due to the much shorter software path length required to enable its use compared to a traditional I/O model.
Comparing cache architectures and coherency protocols on x86-64 multicore SMP systems
- Computer Science, 2009 42nd Annual IEEE/ACM International Symposium on Microarchitecture (MICRO)
- 2009
A set of sophisticated benchmarks for latency and bandwidth measurements to arbitrary locations in the memory subsystem is presented, and the coherency state of cache lines is considered to analyze the cache coherency protocols and their performance impact.
Energy and performance exploration of accelerator coherency port using Xilinx ZYNQ
- Computer Science, FPGAworld
- 2013
This is the first work that presents detailed practical comparisons of the speed and energy efficiency of various processor-accelerator memory sharing techniques in a configurable heterogeneous platform.
Accelerating Pattern Matching Queries in Hybrid CPU-FPGA Architectures
- Computer Science, SIGMOD Conference
- 2017
This work integrates the hardware accelerator into MonetDB, a main-memory column store, and demonstrates a significant improvement in response time and throughput, and provides a novel and efficient implementation of two commonly used SQL operators for strings.
Exploring Portability and Performance of OpenCL FPGA Kernels on Intel HARPv2
- Computer Science, IWOCL
- 2019
This work targets the second iteration of the HARPv2 platform using HLS through porting of OpenCL kernels originally written for FPGAs connected via a PCIe bus, and explores the portability of kernels through a hardware design space search, and empirically shows the benefits of using the shared virtual memory (SVM) abstraction over explicit reads and writes.
Project PBerry: FPGA Acceleration for Remote Memory
- Computer Science, HotOS
- 2019
This approach uses emerging cache-coherent FPGAs to expose cache coherence events to the operating system and enables other use cases, such as live virtual machine migration, unified virtual memory, security and code analysis, which open up many promising research directions.
CoNDA: Efficient Cache Coherence Support for Near-Data Accelerators
- Computer Science, 2019 ACM/IEEE 46th Annual International Symposium on Computer Architecture (ISCA)
- 2019
CoNDA is proposed, a coherence mechanism that lets an NDA optimistically execute an NDA kernel under the assumption that the NDA has all necessary coherence permissions, which allows CoNDA to gather information on the memory accesses performed by the NDA and by the rest of the system.
IBM POWER9 opens up a new era of acceleration enablement: OpenCAPI
- Computer Science, IBM J. Res. Dev.
- 2018
Open Coherent Accelerator Processor Interface (OpenCAPI) is a new industry-standard device interface that enables the development of host-agnostic devices that can coherently connect to any host…