This poster proposes the use of 3D integration technology to enable low-overhead reconfigurable computing. In our scheme, a 64 Megabyte DRAM array is stacked on top of an FPGA using face-to-face bonding; it caches up to 289 future configurations, which can be quickly loaded onto the FPGA. Past DRAMs have been designed for off-chip communication, a bottleneck that 3D stacking eliminates; hence, the DRAM array is redesigned. To reconfigure the FPGA, a configuration is read from the DRAM into a latch array while the FPGA executes; then, the configuration is loaded from the latch array into the FPGA in 5 cycles (60 ns). The minimum latency between reconfigurations, 8.42 μs, is dominated by the time to load data from the DRAM into the latch array. The benefits, area cost, and performance of the proposed system are evaluated on three previously published FPGA implementations of multimedia applications (MP3 decoding, MPEG-4 decoding, and JPEG compression) under three scenarios: No Dynamic ReConfiguration (NDRC), Off-chip Dynamic ReConfiguration (ODRC), and 3D Configuration Caching (3DCC). Our experiments demonstrate that 3D configuration caching works best when used in conjunction with FPGA-based accelerators, rather than pure FPGA-based systems; in accelerator-based systems, the reconfiguration latency can easily be hidden behind software execution on the processor controlling the accelerator. This significantly reduces the amount of silicon area that must be dedicated to the accelerator, while imposing virtually no performance penalty compared to significantly larger accelerators that do not require reconfiguration.
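The figures quoted above imply a few derived quantities, which the following back-of-the-envelope sketch works out. It is not taken from the poster: the function names are hypothetical, the 64 Megabyte capacity is assumed to mean binary megabytes, the whole DRAM is assumed to be divided evenly among the 289 cached configurations (so the per-configuration size is only an upper bound), and the 8.42 latency figure is assumed to be in microseconds.

```python
# Rough arithmetic on the numbers quoted in the abstract.
# All names here are illustrative; none come from the original work.

DRAM_BYTES = 64 * 2**20          # assumption: "64 Megabyte" means 64 * 2^20 bytes
CACHED_CONFIGS = 289             # configurations cached in the stacked DRAM
LOAD_CYCLES = 5                  # latch array -> FPGA load, in cycles
LOAD_TIME_NS = 60.0              # the same load, in nanoseconds
MIN_RECONFIG_LATENCY_S = 8.42e-6 # assumption: the 8.42 figure is in microseconds


def config_size_bytes(dram_bytes: int = DRAM_BYTES, n: int = CACHED_CONFIGS) -> int:
    """Upper bound on one configuration's size if the DRAM is split evenly."""
    return dram_bytes // n


def cycle_time_ns(load_time_ns: float = LOAD_TIME_NS, cycles: int = LOAD_CYCLES) -> float:
    """Clock period implied by loading a configuration in 5 cycles (60 ns)."""
    return load_time_ns / cycles


def implied_dram_bandwidth_gb_per_s(latency_s: float = MIN_RECONFIG_LATENCY_S) -> float:
    """DRAM-to-latch-array bandwidth needed to move one configuration
    within the minimum reconfiguration latency (decimal GB/s)."""
    return config_size_bytes() / latency_s / 1e9


if __name__ == "__main__":
    print(f"config size   <= {config_size_bytes()} bytes")
    print(f"cycle time     = {cycle_time_ns()} ns")
    print(f"DRAM bandwidth ~ {implied_dram_bandwidth_gb_per_s():.1f} GB/s")
```

Under these assumptions each configuration occupies at most roughly 227 KiB, the clock period is 12 ns, and the stacked DRAM would need to deliver on the order of tens of gigabytes per second to the latch array, which is consistent with the abstract's point that the DRAM-to-latch transfer dominates the reconfiguration latency.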