Implementation of parallel full search algorithm for motion estimation on multi-core processors
Motion estimation accounts for a significant share of the computation in video standards such as MPEG-2, MPEG-4, and H.264/AVC. This paper evaluates the performance of a motion estimation algorithm on the TM3270, a low-cost media processor. To improve performance, the TM3270 provides architectural enhancements over previous TriMedia processors. We quantify the speedup that the proposed new operations bring to motion estimation and show that the new operations incorporated in the TM3270 improve performance by a factor of between 3 and 4. Furthermore, we quantify the speedup from data prefetching and show that prefetching can improve performance by up to 30%. With all TM3270 architectural enhancements applied, we show that standard-resolution motion estimation can be performed using less than 5% of the available processor performance.
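To make the computational load concrete, the following is a minimal sketch of full-search block matching, the technique named in the title. The abstract does not say which cost function, block size, or search range is used, so the sum of absolute differences (SAD), the 8x8 block, and the ±4 pixel search range are assumptions, and the names `sad` and `full_search` are hypothetical. It is plain scalar Python and does not model any TM3270 operation or prefetching.

```python
def sad(cur, ref, bx, by, rx, ry, n):
    """Sum of absolute differences between the n x n block of `cur`
    at (bx, by) and the n x n block of `ref` at (rx, ry).
    Frames are lists of rows of integer pixel values."""
    total = 0
    for y in range(n):
        cur_row = cur[by + y]
        ref_row = ref[ry + y]
        for x in range(n):
            total += abs(cur_row[bx + x] - ref_row[rx + x])
    return total


def full_search(cur, ref, bx, by, n=8, search_range=4):
    """Exhaustively test every displacement within +/- search_range
    and return (dx, dy, cost) for the lowest-SAD candidate.
    Block size and search range are assumed values for illustration."""
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            rx, ry = bx + dx, by + dy
            # Skip candidates that fall partly outside the reference frame.
            if rx < 0 or ry < 0 or rx + n > width or ry + n > height:
                continue
            cost = sad(cur, ref, bx, by, rx, ry, n)
            # Strict '<' keeps the first candidate found on ties.
            if best is None or cost < best[2]:
                best = (dx, dy, cost)
    return best
```

With these assumed parameters, each block is compared against up to (2·4 + 1)² = 81 candidate positions, and each comparison takes 64 absolute differences. That per-block cost, repeated over every block of every frame, is why the inner SAD loop is a natural target for dedicated operations and data prefetching.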