Limits of Data-Level Parallelism

Abstract

A new breed of processors, such as the Cell Broadband Engine, the Imagine stream processor, and various GPUs, emphasizes data-level parallelism (DLP) and thread-level parallelism (TLP) as opposed to traditional instruction-level parallelism (ILP). This allows them to achieve order-of-magnitude improvements over conventional superscalar processors for many workloads. However, it is unclear how much parallelism of these types exists in current programs. Most earlier studies have concentrated on the amount of ILP in a program, without differentiating DLP or TLP. In this study, we investigate the extent of data-level parallelism available in programs in the MediaBench suite. By packing instructions in a SIMD fashion, we observe reductions of up to 91% (84% on average) in the number of dynamic instructions, indicating a very high degree of DLP in several applications.
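To make the reduction metric concrete, the sketch below shows one simple way a drop in dynamic instruction count could be computed when instructions are packed SIMD-style. It is an illustration only, not the method used in the paper: the function names `packed_count` and `reduction`, the fixed SIMD `width`, and the assumption that all operations with the same opcode are mutually independent are hypothetical. Because data dependences are ignored, the result is an optimistic upper bound.

```python
from collections import Counter
from math import ceil


def packed_count(trace, width):
    """Number of instructions left after packing a dynamic trace.

    Assumption (hypothetical, for illustration): every operation with the
    same opcode is independent of the others, so up to `width` of them
    can be merged into a single SIMD instruction.
    """
    if width < 1:
        raise ValueError("width must be at least 1")
    per_opcode = Counter(trace)
    # Each opcode needs ceil(n / width) packed instructions.
    return sum(ceil(n / width) for n in per_opcode.values())


def reduction(trace, width):
    """Fractional reduction in dynamic instruction count after packing."""
    if not trace:
        return 0.0
    return 1 - packed_count(trace, width) / len(trace)


# Toy trace: 8 adds and 4 multiplies, packed 4 wide.
toy_trace = ["add"] * 8 + ["mul"] * 4
print(packed_count(toy_trace, 4))  # 3 packed instructions
print(reduction(toy_trace, 4))     # 0.75, i.e. a 75% reduction
```

With this toy trace, 12 scalar operations collapse into 3 packed ones. A real limit study would also need to respect data dependences and operand layout, which this sketch deliberately leaves out.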

Cite this paper

@inproceedings{Pai2007LimitsOD, title={Limits of Data-Level Parallelism}, author={Sreepathi Pai and R. Govindarajan and Matthew J. Thazhuthaveetil}, year={2007} }