A Synergetic Approach to Throughput Computing on x86-Based Multicore Desktops

To exploit the full performance potential of multicore desktops, the authors propose an approach that combines cache optimization, parallelization, SIMDization, and autotuning in a single framework.