Anupam Chattopadhyay

Learn More
Nowadays, Architecture Description Languages (ADLs) are getting popular to speed up the development of complex SoC design, by performing the design space exploration in a higher level of abstraction. This increase in the abstraction level traditionally comes at the cost of low performance of the final Application Specific Instructionset Processor (ASIP)(More)
In terms of energy and flexibility, Coarse-Grained Reconfigurable Architectures (CGRA) are proven to be advantageous over fine-grained architectures, massively parallel GPUs and generic CPUs. However the key challenge of programmability is preventing wide-spread adoption. To exploit instruction level parallelism inherent to such architectures, optimal(More)
Architecture description languages are widely used to perform architecture exploration for application-driven designs, whereas the RT-level is the commonly accepted level for hardware implementation. For this reason, design parameters such as timing, area or power consumption cannot be taken into consideration accurately during design space exploration.(More)
Architecture Description Languages (ADLs) are widely used to perform design space exploration for Application Specific Instruction Set Processors (ASIPs). While the design space exploration is well supported by numerous tools providing high flexibility and quality, the methodology of automated implementation is limited to simple transformations. Assuming(More)
Cryptographic coprocessors are inherent part of modern System-on-Chips. It serves dual purpose - efficient execution of cryptographic kernels and supporting protocols for preventing IP-piracy. Flexibility in such coprocessors is required to provide protection against emerging cryptanalytic schemes and to support different cryptographic functions like(More)
RC4 is the most popular stream cipher in the domain of cryptology. In this paper, we present a systematic study of the hardware implementation of RC4, and propose the fastest known architecture for the cipher. We combine the ideas of hardware pipeline and loop unrolling to design an architecture that produces 2 RC4 keystream bytes per clock cycle. We have(More)
The conflicting requirements of performance and flexibility in today's embedded system market are forcing system designers to use more and more of the so called Configurable or Customizable processor cores. Such processors tend to meet the demanding performance constraints by accommodating application specific Instruction-Set Extensions (ISEs) which have,(More)
Static timing analysis provides the basis for setting the clock period of a microprocessor core, based on its worst-case critical path. However, depending on the design, this critical path is not always excited and therefore dynamic timing margins exist that can theoretically be exploited for the benefit of better speed or lower power consumption (through(More)