Multi-FPGA Accelerator for Scalable Stencil Computation with Constant Memory Bandwidth

@article{Sano2014MultiFPGAAF,
  title={Multi-FPGA Accelerator for Scalable Stencil Computation with Constant Memory Bandwidth},
  author={Kentaro Sano and Yoshiaki Hatsuda and Satoru Yamamoto},
  journal={IEEE Transactions on Parallel and Distributed Systems},
  year={2014},
  volume={25},
  pages={695--705}
}
Stencil computation is one of the important kernels in scientific computing. However, its sustained performance is limited by memory bandwidth, especially on multicore microprocessors and graphics processing units (GPUs), because stencil kernels have low operational intensity. In this paper, we present a custom computing machine (CCM), called a scalable streaming-array (SSA), for high-performance stencil computations with multiple field-programmable gate arrays (FPGAs). We design SSA…
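
To make the bandwidth argument concrete, here is a minimal sketch of a stencil sweep and a rough operational-intensity estimate. It is illustrative only and is not taken from the paper: the 5-point averaging stencil, the function names `jacobi_step` and `operational_intensity`, and the flop and byte counts (5 flops per cell, one 4-byte read and one 4-byte write per cell under ideal on-chip reuse) are all assumptions chosen for the example.

```python
def jacobi_step(grid):
    """Apply one sweep of a 5-point averaging stencil (hypothetical example kernel).

    Interior cells become the mean of themselves and their four neighbours;
    boundary cells are copied through unchanged.
    """
    rows, cols = len(grid), len(grid[0])
    out = [row[:] for row in grid]  # copy so reads always see the old values
    for i in range(1, rows - 1):
        for j in range(1, cols - 1):
            out[i][j] = 0.2 * (
                grid[i][j]
                + grid[i - 1][j]
                + grid[i + 1][j]
                + grid[i][j - 1]
                + grid[i][j + 1]
            )
    return out


def operational_intensity(flops_per_cell, bytes_per_cell):
    """Return flops per byte of off-chip traffic for one cell update."""
    return flops_per_cell / bytes_per_cell


# Assumed counts: 4 additions + 1 multiplication per cell, and (with ideal
# reuse of neighbouring values on chip) one 4-byte load and one 4-byte store.
ASSUMED_FLOPS_PER_CELL = 5
ASSUMED_BYTES_PER_CELL = 8

if __name__ == "__main__":
    grid = [[0.0, 0.0, 0.0], [0.0, 5.0, 0.0], [0.0, 0.0, 0.0]]
    print(jacobi_step(grid))
    print(operational_intensity(ASSUMED_FLOPS_PER_CELL, ASSUMED_BYTES_PER_CELL))
```

Under these assumed counts the kernel performs well under one flop per byte moved, so a single sweep over a large grid is limited by how fast data can be streamed from memory rather than by arithmetic throughput.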

Citations

Publications citing this paper.

OpenCL-Based FPGA-Platform for Stencil Computation and Its Optimization Methodology

  • IEEE Transactions on Parallel and Distributed Systems
  • 2017

FPGA-based Custom Computing Architecture for Large-Scale Fluid Simulation with Building Cube Method

  • SIGARCH Computer Architecture News
  • 2014

FPGA-Based Scalable and Power-Efficient Fluid Simulation using Floating-Point DSP Blocks

  • IEEE Transactions on Parallel and Distributed Systems
  • 2017

2D Stencil Computation on Cyclone V SoC FPGA using OpenCL

  • 2018 International Conference on Radar, Antenna, Microwave, Electronics, and Telecommunications (ICRAMET)
  • 2018

From Tensor Algebra to Hardware Accelerators: Generating Streaming Architectures for Solving Partial Differential Equations

  • 2018 IEEE 29th International Conference on Application-specific Systems, Architectures and Processors (ASAP)
  • 2018

References

Publications referenced by this paper.

Scalable Streaming-Array of Simple Soft-Processors for Stencil Computations with Constant Memory-Bandwidth

  • 2011 IEEE 19th Annual International Symposium on Field-Programmable Custom Computing Machines
  • 2011

Implementing the Himeno benchmark with CUDA on GPU clusters

  • 2010 IEEE International Symposium on Parallel & Distributed Processing (IPDPS)
  • 2010

Local-and-Global Stall Mechanism for Systolic Computational-Memory Array on Extensible Multi-FPGA System

W. Luzhou, K. Sano, S. Yamamoto
  • Proc. Int’l Conf. Field-Programmable Technology, pp. 102-109, Dec. 2010.
  • 2010

Prototype Implementation of Array-Processor Extensible over Multiple FPGAs for Scalable Stencil Computation

K. Sano, W. Luzhou, S. Yamamoto
  • ACM SIGARCH Computer Architecture News, vol. 38, no. 4, pp. 80-86, Dec. 2010.
  • 2010