This paper from the Berkeley BRASS group was all about performance: Is it possible to design an FPGA architecture that can compete with processors and ASICs in terms of clock frequency? FPGAs ran (and still run) at clock frequencies 5x to 10x lower, largely due to the effect of configurability on both logic and interconnect delay. Von Herzen’s [1997] ...
Reconfigurable systems can offer the high spatial parallelism and fine-grained, bit-level resource control traditionally associated with hardware implementations, along with the flexibility and adaptability characteristic of software. While reconfigurable systems create new opportunities for engineering and delivering high-performance programmable systems, ...
Steady-state dendritic cells (DC) found in non-lymphoid tissue sites under normal physiologic conditions play a pivotal role in triggering T cell responses upon immune provocation. CD11b+ and CD103+ DC have received considerable attention in this regard. However, it is still unknown whether such CD11b+ and CD103+ DC even exist in the ocular mucosa, and if so, ...
A primary impediment to widespread exploitation of reconfigurable computing is the lack of a unifying computational model that allows application portability and longevity without sacrificing a substantial fraction of the raw capabilities. We introduce SCORE (Stream Computation Organized for Reconfigurable Execution), a stream-based compute model which ...
The SCORE compute model uses fixed-size, virtual compute and memory pages connected by stream links to capture the definition of a computation abstracted from the detailed size of the physical hardware. When the number of physical compute pages is smaller than the number of virtual compute pages in the abstract computation graph, the design is ...
FPGA place and route is time-consuming, often serving as the major obstacle inhibiting a fast edit-compile-test loop in prototyping and development and the major obstacle preventing late-bound hardware and design mapping for reconfigurable systems. Previous work showed that hardware-assisted routing can accelerate fanout-free routing on Fat-Trees by three ...
To fully realize the benefits of partial and rapid reconfiguration of field-programmable devices, we often need to dynamically schedule computing tasks and generate instance-specific configurations: new graphs which must be routed during program execution. Consequently, route time can be a significant overhead cost reducing the achievable net benefits of ...
Current-generation Deep Neural Networks (DNNs), such as AlexNet and VGG, rely heavily on dense floating-point matrix multiplication (GEMM), which maps well to GPUs (regular parallelism, high TFLOP/s). Because of this, GPUs are widely used for accelerating DNNs. Current FPGAs offer superior energy efficiency (Ops/Watt), but they do not offer the performance ...
FPGA placement and routing is time-consuming, often serving as the major obstacle inhibiting a fast edit-compile-test loop in prototyping and development and the major obstacle preventing late-bound hardware and design mapping for reconfigurable systems. We introduce a stochastic search scheme which can achieve comparable route quality to traditional, ...