System-level simulators allow computer architects and system software designers to recreate an accurate and complete replica of the program behavior of a target system, regardless of the availability, existence, or instrumentation support of such a system. Applications include evaluation of architectural design alternatives as well as software engineering ...
It is a common belief that computer performance growth is over 50% annually, or that performance doubles every 18-20 months. By analyzing publicly available results from the SPEC integer (CINT) benchmark suites, we conclude that this was true between 1985 and 1996, the early years of the RISC paradigm. During the last 7.5 years (1996-2004), however, ...
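The two figures quoted in this abstract (annual growth rate and doubling time) are related by simple compound-growth arithmetic. The sketch below is not from the paper; it assumes a constant growth rate, and the function names are hypothetical, chosen only for illustration.

```python
import math

def doubling_time_months(annual_growth):
    """Months needed to double at a constant annual growth rate (0.5 means 50%)."""
    return 12 * math.log(2) / math.log(1 + annual_growth)

def annual_growth_from_doubling(months):
    """Constant annual growth rate implied by doubling every `months` months."""
    return 2 ** (12 / months) - 1

# Assumption: growth compounds at a fixed rate over the whole period.
print(round(doubling_time_months(0.5), 1))         # 20.5 months at 50% per year
print(round(annual_growth_from_doubling(18), 3))   # 0.587, i.e. about 59% per year
print(round(annual_growth_from_doubling(20), 3))   # 0.516, i.e. about 52% per year
```

Under that assumption, doubling every 18-20 months corresponds to roughly 52-59% annual growth, which is consistent with the "over 50% annually" phrasing.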
Two-level coherence predictors have shown great promise to reduce coherence overhead in shared-memory multiprocessors. However, to be accurate they require a memory overhead that, on a 64-processor machine for example, can be as high as 50%. Based on an application case study consisting of seven applications from SPLASH-2, a first observation made in this paper is ...
On-line transaction processing exhibits poor memory behavior in high-end multiprocessor servers because of complex sharing patterns and substantial interaction between the database server and the operating system. One contributing source is a large number of load-store sequences in the program, resulting in many read misses as well as much global ...
We propose a powerful hardware architecture for pixel shading, which enables flexible control of shading rates and automatic shading reuse between triangles in tessellated primitives. The main goal is efficient pixel shading for moderately to finely tessellated geometry, which is not handled well by current GPUs. Our method effectively decouples the cost of ...
Parallel programs that modify shared data in a cache-coherent multiprocessor with a write-invalidate coherence protocol create ownership overhead in the form of ownership acquisitions at writes to shared data. This can have a significant impact on performance in a cache-coherent non-uniform memory architecture (NUMA) multiprocessor. By combining a ...
Here we reconstruct the paleotopography of Northern Hemisphere ice sheets during the glacial maxima of marine isotope stages (MIS) 5b and 4. We employ a combined approach, blending geologically based reconstruction and numerical modeling, to arrive at probable ice sheet extents and topographies for each of these two time slices. For a physically based 3-D ...
This paper assumes the availability of a very fast higher-dimensional rasterizer in future graphics processors. Working in up to five dimensions, i.e., adding time and lens parameters, it is well known that this can be used to render scenes with both motion blur and depth of field. Our hypothesis is that such a rasterizer can also be used as a flexible tool ...