Reconfigurable architectures such as FPGAs are flexible alternatives to DSPs or ASICs used in mobile devices for which energy is a key performance metric. Reconfigurable architectures offer several parameters such as operating frequency, precision, amount of memory, number of computation units, etc. These parameters define a large design space that must…

We develop new algorithms and architectures for matrix multiplication on configurable devices. These have reduced energy dissipation and latency compared with the state-of-the-art field-programmable gate array (FPGA)-based designs. By profiling well-known designs, we identify "energy hot spots," which are responsible for most of the energy dissipation.…

FPGAs are increasingly being used in the high performance and scientific computing community to implement floating-point based hardware accelerators. In this paper we analyze the floating-point multiplier and adder/subtractor units by considering the number of pipeline stages of the units as a parameter and use throughput/area as the metric. We achieve…

In this paper, we present techniques for energy-efficient design at the algorithm level using FPGAs. We then use these techniques to create energy-efficient designs for two signal processing kernel applications: fast Fourier transform (FFT) and matrix multiplication. We evaluate the performance, in terms of both latency and energy efficiency, of FPGAs in…

In this paper, we first develop a novel architecture for fixed-point LU decomposition of streaming input matrices on FPGAs. Our architecture, based on a circular linear array, achieves the minimal latency and is resource-efficient. We then extend it, by using a stacked matrices approach, to a floating-point based architecture which achieves the minimal…

We develop new algorithms and architectures for matrix multiplication on configurable hardware. These designs significantly reduce the latency as well as the area. Our designs improve the previous designs in [7] and [1] in terms of the area/speed metric where the speed denotes the maximum achievable running frequency. The area/speed metrics for the

Advances in their technologies have positioned FPGAs and embedded processors to compete with digital signal processors (DSPs). In this paper, we evaluate the performance in terms of both latency and energy-efficiency of FPGAs, embedded processors, and DSPs in multiplying two n × n matrices. As specific examples, we have chosen a representative of each…

In this paper, new algorithms and architectures for matrix factorization are presented. Two fully-parallel and block-based designs for LU decomposition on configurable devices are proposed. A linear array architecture is employed to minimize the usage of long interconnects, leading to lower energy dissipation. The designs are made scalable by using a fixed…

In this paper, we develop energy-efficient designs for the Fast Fourier Transform (FFT) on FPGAs. Architectures for FFT on FPGAs are designed by investigating and applying techniques for minimizing the energy dissipation. Architectural parameters such as degrees of vertical and horizontal parallelism are identified and a design domain is created through a…