In this paper, a fast radix-4 complex FFT implementation using 4-parallel SIMD instructions is presented. Four radix-4 butterflies are calculated in parallel at all stages by loading consecutive 4 elements into a register. At the last stage, every 4 elements is packed into a register and calculated in parallel. This regular data flow enables higher… (More)
This paper presents a low-power, 32-bit RISC microprocessor with a 64-bit " single-instruction multiple-data " multimedia coprocessor, V830R/AV, and its MPEG-2 video decoding performance. This coprocessor basically performs multimedia-oriented four 16-bit operations every clock, such as multiply-accumulate with symmetric rounding and saturation, and… (More)
Presented here is MPEG-2 AAC decoder software for a low-power embedded RISC microprocessor, NEC VS30 (300mW @133MHz). Fast processing methods for IMDCT reduce execution time by 41% and help achieve real-time decoding of a 5.1-channel audio signal, while using only 64.7% of processor capacity.