Javier D. Bruguera

Learn More
New image and video coding standards have pushed the limits of compression by introducing new techniques with high computational demands. The Advanced Video Coder (ITU-T H.264, AVC MPEG-4 Part 10) is the last international standard, which introduces new enhanced features that require new levels of performance. Among the new tools present in AVC, the(More)
A table-based method for high-speed function approximation in single-precision floating-point format is presented in this paper. Our focus is the approximation of reciprocal, square root, square root reciprocal, exponentials, logarithms, trigonometric functions, powering (with a fixed exponent p), or special functions. The algorithm presented here combines(More)
A new method for the high-speed computation of double-precision floating-point reciprocal, division, square root, and inverse square root operations is presented in this paper. This method employs a second-degree minimax polynomial approximation to obtain an accurate initial estimate of the reciprocal and the inverse square root values, and then performs a(More)
CORDIC (Coordinate Rotation Digital Computer) is a well known hardware algorithm for computing various elementary functions. In this work two 32 bit radix 4 CORDIC architectures, unfolded and folded are implemented on available FPGA. The unfolded pipelined architecture consists of a linear array of modules in each of which a micro rotation is carried out.(More)
In this paper we propose an efficient implementation of CABAC's binary arithmetic coder and context management system. CABAC is the context adaptive binary arithmetic coder used in new H.264/AVC video standard. Arithmetic coding allows a significant enhancement in compression. However, implementation complexity is a drawback due to hardware cost and(More)
We present the design of parallel architectures for the computation of the Hough transform based on application-specific CORDIC processors. The design of the circular CORDIC in rotation mode is simplified by the a priori knowledge of the angles participating in the transform and a high throughput is obtained through a pipelined design combined with the use(More)
We present a VLSI architecture for the evaluation of the (8~8)-point 2-13 DCT with on-line arithmetic. The utilization #of on-lie arithmetic, in combination with an algorithm based on FCT and matrix multiplication, reduces the total hardware maintaining a data rate and a latency similar to approaches based on distributed or parallel arithmetic. The(More)