Learn More
Asynchronous circuits are more and more predominant because their advantages in comparison with synchronous circuits. While asynchronous circuits can be implemented in custom VLSI, their fabricated-time is too long to allow rapid prototyping. Meanwhile, FPGA devices are dominant implementation media for digital circuits. Unfortunately, they do not support(More)
Multicore processing, especially heterogeneous multicore, is being increasingly used for data intensive processing in embedded systems. An important challenge in multicore processing is, efficiently, to get the data to the computing core that needs it. In order to have an efficient interconnect design for multicore architectures, a detailed profiling of(More)
The communication infrastructure is one of the important components of a multicore system along with the computing cores and memories. A good interconnect design plays a key role in improving the performance of such systems. In this paper, we introduce a hybrid communication infrastructure using both the standard bus and our area-efficient and(More)
—In this paper, we present an overview of interconnect solutions for hardware accelerator systems. A number of solutions are presented: bus-based, DMA, crossbar, NoC, as well as combinations of these. The paper proposes analytical models to predict the performance of these solutions and implements them in practice. The jpeg decoder application is(More)
In this paper, we introduce an automated interconnect design strategy to create an efficient custom interconnect for kernels in an FPGA-based accelerator system to accelerate their communication behavior. Our custom interconnect includes an NoC, shared local memory solution or both. Depending on the quantitative communication profiling of the application,(More)
This paper proposes an FPGA-based multicore architecture to integrate multiple DDoS defense mechanisms for DDoS protection. The architecture allows multiple cooperating DDoS mitigation techniques to classify incoming network packets. The proposed architecture consists of two separate partitions static and dynamic. The static partition includes packet(More)
—This paper proposes a heterogeneous hardware accelerator architecture to support streaming image processing. Each image in a data-set is pre-processed on a host processor and sent to hardware kernels. The host processor and the hardware kernels process a stream of images in parallel. The Convey hybrid computing system is used to develop our proposed(More)
—Multicore architectures, especially hardware accelerator systems with heterogeneous processing elements, are being increasingly used due to the increasing processing demand of modern digital systems. However, data communication in multicore architectures is one of the main performance bottlenecks. Therefore, reducing data communication overhead is an(More)
High-level synthesis (HLS) is an automated design process that deals with the generation of behavioral hardware descriptions from high-level algorithmic specifications. The main benefit of this approach is that ever-increasing system-on-chip (SoC) design complexity and ever-shorter time-to-market can still be both manageable and achievable. This advantage,(More)
Although heterogeneous multicore systems are widely used in both academia and industry, system performance of such systems does not scale when increasing the number of processing cores. The main reason is due to the communication overhead which increases greatly with the increasing number of cores. In this paper, we propose an automated design approach to(More)