This paper presents extended instructions for accelerating Advanced Encryption Standard (AES) cryptography on embedded processors, together with an efficient implementation of these instructions. The AES instructions generate four elements in single-instruction, multiple-data (SIMD) format from each input of an AES state. The instruction count for 128-bit key AES…
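The abstract does not spell out the semantics of these instructions. As a rough illustration of how one byte of an AES state can yield four output elements, the sketch below assumes the familiar combination of SubBytes with one column of MixColumns, where the S-box output is multiplied by the coefficients {02}, {01}, {01}, {03}. The function names (`xtime`, `gf_mul`, `sbox`, `expand_state_byte`) are hypothetical, and this is not claimed to be the instruction defined in the paper.

```python
def xtime(b):
    """Multiply a byte by {02} in GF(2^8) modulo the AES polynomial 0x11B."""
    b <<= 1
    if b & 0x100:
        b ^= 0x11B
    return b & 0xFF


def gf_mul(a, b):
    """Multiply two bytes in GF(2^8) by shift-and-add."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a = xtime(a)
        b >>= 1
    return result


def sbox(x):
    """AES S-box: multiplicative inverse followed by the affine transform."""
    inv = 0
    if x:
        # Brute-force inverse search; fine for a sketch, not for production.
        inv = next(c for c in range(1, 256) if gf_mul(x, c) == 1)

    def rotl(v, n):
        return ((v << n) | (v >> (8 - n))) & 0xFF

    return inv ^ rotl(inv, 1) ^ rotl(inv, 2) ^ rotl(inv, 3) ^ rotl(inv, 4) ^ 0x63


def expand_state_byte(x):
    """Assumed semantics: return the four bytes ({02}s, s, s, {03}s), s = sbox(x)."""
    s = sbox(x)
    return (xtime(s), s, s, xtime(s) ^ s)
```

Under this assumption, XOR-ing the suitably rotated four-byte results for the four bytes of a shifted state column gives one output column of an AES round before the round key is added.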
This paper presents a fast radix-4 complex FFT implementation using 4-parallel SIMD instructions. Four radix-4 butterflies are calculated in parallel at every stage by loading 4 consecutive elements into a register. At the last stage, every 4 elements are packed into a register and calculated in parallel. This regular data flow enables higher…
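The data flow described above can be sketched in plain Python, with a 4-element list standing in for one SIMD register. The example assumes a forward transform and omits twiddle factors; `radix4_butterfly` and `butterflies_4wide` are hypothetical names, not the paper's code.

```python
def radix4_butterfly(a, b, c, d):
    """4-point forward DFT of (a, b, c, d) without twiddle factors."""
    t0 = a + c
    t1 = a - c
    t2 = b + d
    t3 = -1j * (b - d)  # multiplication by -j
    return (t0 + t2, t1 + t3, t0 - t2, t1 - t3)


def butterflies_4wide(ra, rb, rc, rd):
    """Run four butterflies at once, one per lane of four 4-element 'registers'.

    Each input register holds 4 consecutive elements; lane k of the four
    registers supplies the four inputs of butterfly k.
    """
    outs = [radix4_butterfly(a, b, c, d) for a, b, c, d in zip(ra, rb, rc, rd)]
    # Transpose so that each returned list is again one 4-lane register.
    return tuple(list(lane) for lane in zip(*outs))


# First stage of a 16-point transform: each register is a run of 4 consecutive inputs.
x = [complex(n) for n in range(16)]
y0, y1, y2, y3 = butterflies_4wide(x[0:4], x[4:8], x[8:12], x[12:16])
```

Because every register is filled from contiguous memory, no shuffling is needed before the butterflies in this stage, which is the regularity the abstract points to.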
This paper presents a low-power, 32-bit RISC microprocessor with a 64-bit "single-instruction multiple-data" multimedia coprocessor, the V830R/AV, and its MPEG-2 video decoding performance. The coprocessor basically performs four multimedia-oriented 16-bit operations every clock cycle, such as multiply-accumulate with symmetric rounding and saturation, and…
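To make the multiply-accumulate operation concrete, here is a minimal sketch of a four-lane, 16-bit version. The abstract does not give the exact V830R/AV semantics, so the details are assumptions: products are scaled by a right shift of 15 bits (Q15 fixed point), "symmetric rounding" is taken to mean rounding halves away from zero, and results saturate to the signed 16-bit range. All names are hypothetical.

```python
INT16_MIN, INT16_MAX = -32768, 32767


def saturate16(v):
    """Clamp an integer to the signed 16-bit range instead of wrapping."""
    return max(INT16_MIN, min(INT16_MAX, v))


def round_shift_symmetric(v, shift):
    """Right shift with halves rounded away from zero (assumed meaning)."""
    half = 1 << (shift - 1)
    magnitude = (abs(v) + half) >> shift
    return magnitude if v >= 0 else -magnitude


def mac4(acc, a, b, shift=15):
    """Four-lane multiply-accumulate: acc[i] + round(a[i] * b[i] >> shift)."""
    return [
        saturate16(c + round_shift_symmetric(x * y, shift))
        for c, x, y in zip(acc, a, b)
    ]
```

Rounding on the magnitude treats positive and negative products alike: `round_shift_symmetric(3, 1)` gives 2 and `round_shift_symmetric(-3, 1)` gives -2, whereas adding the half and applying an arithmetic shift would give -1 for the negative case.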
This paper presents MPEG-2 AAC decoder software for a low-power embedded RISC microprocessor, the NEC VS30 (300 mW at 133 MHz). Fast processing methods for the IMDCT reduce execution time by 41% and help achieve real-time decoding of a 5.1-channel audio signal while using only 64.7% of processor capacity.