Most image processing applications are computationally intensive and data intensive. Reconfigurable hardware boards provide a convenient and flexible solution to speed up these algorithms. To get a high performance design without going through the time-consuming hardware design process for each different algorithm, we present a universal parameterized …
Window operations, which are both computationally intensive and data intensive, are frequently used in image compression, pattern recognition and digital signal processing. The efficiency of memory access often dominates overall computation performance, and the problem becomes increasingly crucial in reconfigurable systems. The challenge is to intelligently …
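As a rough illustration of why memory access matters for window operations, the sketch below applies a k×k window to a small image in plain Python. The function name `window_operation` and its parameters are hypothetical and are not taken from the paper; the point is only that a naive implementation rereads each pixel up to k×k times, which is the access pattern that buffering schemes aim to reduce.

```python
def window_operation(image, k, reduce_fn):
    """Slide a k x k window over a 2D list and reduce each window to one value.

    Naive version: every output value gathers its k*k inputs again, so
    interior pixels are read up to k*k times.
    """
    rows, cols = len(image), len(image[0])
    out = []
    for r in range(rows - k + 1):
        out_row = []
        for c in range(cols - k + 1):
            # Collect the k*k pixels covered by the window at (r, c).
            window = [image[r + i][c + j] for i in range(k) for j in range(k)]
            out_row.append(reduce_fn(window))
        out.append(out_row)
    return out


image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
local_max = window_operation(image, 2, max)
```

Here `local_max` holds one maximum per 2×2 window position; any reduction that accepts a list (for example `sum` or `min`) can be passed in the same way.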
This paper explores the capability and flexibility of field-programmable gate arrays (FPGAs) to implement variable-precision floating-point (VP) arithmetic. First, the VP exact dot product algorithm, which uses exact fixed-point operations to obtain an exact result, is presented. A VP multiplication and accumulation unit (VPMAC) on FPGA is then …
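The idea of an exact dot product can be sketched in software: accumulate every product without rounding, then round once at the end. The example below is a minimal illustration and not the paper's VPMAC design; it uses Python's `fractions.Fraction` as a stand-in for a wide fixed-point accumulator, and the function names `exact_dot` and `naive_dot` are hypothetical.

```python
from fractions import Fraction


def exact_dot(xs, ys):
    """Dot product with exact accumulation and a single final rounding."""
    if len(xs) != len(ys):
        raise ValueError("vectors must have the same length")
    acc = Fraction(0)
    for x, y in zip(xs, ys):
        # Fraction(float) is exact, so each product and sum is exact too.
        acc += Fraction(x) * Fraction(y)
    return float(acc)  # the only rounding step


def naive_dot(xs, ys):
    """Ordinary floating-point dot product, rounding after every operation."""
    acc = 0.0
    for x, y in zip(xs, ys):
        acc += x * y
    return acc


xs = [1e16, 1.0, -1e16]
ys = [1.0, 1.0, 1.0]
exact_result = exact_dot(xs, ys)
naive_result = naive_dot(xs, ys)
```

With these inputs the naive loop loses the small term to cancellation, while the exact accumulation keeps it. A hardware accumulator would use a fixed-point register wide enough to hold any product exactly instead of arbitrary-precision rationals.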
Many high-level algorithms in image compression, pattern recognition and digital signal processing consist of multiple loop nests. FPGAs provide a convenient and flexible solution to speed up these loop-intensive algorithms. However, FPGA reconfiguration, which takes a long time, is inevitable when switching between the loop nests. …
In this article, a unified VLIW coprocessor for Quad arithmetic and elementary functions (QP_VELP), based on a common group of atomic operation units, is presented. The explicitly parallel scheme of VLIW instructions and Estrin's evaluation scheme for polynomials are used to improve performance. A two-level VLIW instruction RAM scheme is introduced to …
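Estrin's scheme, mentioned above, evaluates a polynomial by pairing adjacent coefficients so that the operations within each level are independent and can run in parallel, which suits an explicitly parallel instruction format. The sketch below is a plain sequential Python illustration with hypothetical function names; it does not reflect how the coprocessor schedules the operations, and it includes Horner's rule only for comparison.

```python
def estrin(coeffs, x):
    """Evaluate sum(coeffs[i] * x**i) using Estrin's pairing scheme.

    coeffs are ordered from the constant term upward.
    """
    if not coeffs:
        return 0
    terms = list(coeffs)
    power = x  # x, then x**2, x**4, ... at successive levels
    while len(terms) > 1:
        if len(terms) % 2:
            terms.append(0)  # pad so every term has a partner
        # Each pair (a, b) becomes a + b * power; pairs are independent.
        terms = [terms[i] + terms[i + 1] * power
                 for i in range(0, len(terms), 2)]
        power = power * power
    return terms[0]


def horner(coeffs, x):
    """Reference evaluation with Horner's rule (strictly sequential)."""
    result = 0
    for c in reversed(coeffs):
        result = result * x + c
    return result


value = estrin([1, 2, 3, 4], 2)  # 1 + 2*2 + 3*4 + 4*8
```

Horner's rule needs one dependent multiply-add per coefficient, whereas Estrin's scheme reduces the dependency depth to roughly the base-2 logarithm of the number of coefficients at the cost of a few extra multiplications for the powers of x.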
Floating-point Fast Fourier Transform (FFT) processors and COordinate Rotation Digital Computer (CORDIC) elements play important roles in communication and radar applications. However, even with the rapid development of large-scale integrated circuits, it is usually impractical to implement these floating-point computations on an FPGA, as they consume a large …