The SPLASH-2 programs: characterization and methodological considerations
TLDR
The SPLASH-2 suite of parallel applications has recently been released to facilitate the study of centralized and distributed shared-address-space multiprocessors.
  • 4,042
  • 551
SPLASH: Stanford parallel applications for shared-memory
TLDR
We present the Stanford Parallel Applications for Shared-Memory (SPLASH), a set of parallel applications for use in the design and evaluation of shared-memory multiprocessing systems.
  • 1,174
  • 77
Memory consistency and event ordering in scalable shared-memory multiprocessors
TLDR
A new model of memory consistency, called release consistency, that allows for more buffering and pipelining than previously proposed models is introduced, with the discussion concentrating on issues relevant to scalable architectures.
  • 724
  • 67
The Stanford Dash multiprocessor
TLDR
The overall goals and major features of the directory architecture for shared memory (Dash) are presented.
  • 1,092
  • 57
Parallel computer architecture - a hardware / software approach
TLDR
The most exciting development in parallel computer architecture is the convergence of traditionally disparate approaches on a common machine structure.
  • 1,148
  • 45
The directory-based cache coherence protocol for the DASH multiprocessor
TLDR
DASH is a scalable shared-memory multiprocessor currently being developed at Stanford's Computer Systems Laboratory.
  • 744
  • 40
Memory consistency and event ordering in scalable shared-memory multiprocessors
TLDR
This paper introduces a new model of memory consistency, called release consistency, that allows for more buffering and pipelining than previously proposed models.
  • 730
  • 39
Design and evaluation of a compiler algorithm for prefetching
TLDR
This paper proposes a compiler algorithm to insert prefetch instructions into code that operates on dense matrices.
  • 818
  • 28
Reducing Memory and Traffic Requirements for Scalable Directory-Based Cache Coherence Schemes
As multiprocessors are scaled beyond single-bus systems, there is renewed interest in directory-based cache coherence schemes. These schemes rely on a directory to keep track of all processors…
  • 303
  • 26
Process control and scheduling issues for multiprogrammed shared-memory multiprocessors
TLDR
Shared-memory multiprocessors are frequently used in a time-sharing style with multiple parallel applications executing at the same time.
  • 326
  • 24