Learn More
—This work explores the data reuse properties of full-search block-matching (FSBM) for motion estimation (ME) and associated architecture designs, as well as memory bandwidth requirements. Memory bandwidth in high-quality video is a major bottleneck to designing an implementable architecture because of large frame size and search range. First, memory(More)
ÐThis paper presents a design methodology for high-speed Booth encoded parallel multiplier. For partial product generation, we propose a new modified Booth encoding (MBE) scheme to improve the performance of traditional MBE schemes. For final addition, a new algorithm is developed to construct multiple-level conditional-sum adder (MLCSMA). The proposed(More)
—This paper presents a memory-efficient approach to realize the cyclic convolution and its application to the discrete cosine transform (DCT). We adopt the way of distributed arithmetic (DA) computation, exploit the symmetry property of DCT coefficients to merge the elements in the matrix of DCT kernel, separate the kernel to be two perfect cyclic forms,(More)
—The ongoing advancements in VLSI technology allow system-on-a-chip (SoC) design to integrate heterogeneous control and computing functions into a single chip. On the other hand, the pressures of area and cost lead to the requirement for a single, shared off-chip DRAM memory subsystem. To satisfy different memory access requirements for latency and(More)
—This paper presents a cost-effective processor core design that features the simplest hardware and is suitable for discrete cosine transform/indiscrete cosine transform (DCT/IDCT) operations in H.263 and digital camera. This design combines the techniques of fast direct two-dimensional DCT algorithm, the bit-level adder-based distributed arithmetic, and(More)