Kiran Bondalapati

Reconfigurable architectures promise significant performance and flexibility advantages over conventional architectures. Automatic mapping techniques that exploit the features of the hardware are needed to leverage the power of these architectures. In this paper, we develop techniques for parallelizing nested loop computations from digital signal...
This paper presents efficient multicasting with reduced contention on irregular networks with switch-based wormhole interconnection and unicast message passing. First, it is proved that for an arbitrary irregular network with a typical deadlock-free, adaptive routing, it may not be possible to create an ordered list of nodes to implement an arbitrary...
Reconfigurable architectures promise significant performance benefits by customizing the configurations to suit the computations. Variable precision for computations is one important method of customization for which reconfigurable architectures are well suited. The precision of the operations can be modified dynamically at run-time to match the precision of the...
Lack of automatic mapping techniques is a significant hurdle in obtaining high performance for general purpose computing on reconfigurable hardware. In this paper, we develop techniques for mapping loop computations from applications onto high performance pipelined configurations. Loop statements with generalized directed acyclic graph dependencies are...
The lack of high-level design tools hampers the widespread adoption of adaptive computing systems. Application developers have to master a wide range of functions, from the high-level architecture design to the timing of actual control and data signals. In this paper we describe DEFACTO, an end-to-end design environment aimed at bridging the gap in tools...
Effective utilization of cache memories is a key factor in achieving high performance in computing the Discrete Fourier Transform (DFT). Most optimization techniques for computing the DFT rely on either modifying the computation and data access order or exploiting low-level platform-specific details, while keeping the data layout in memory static. In this...