Richard L. Sites

This article describes the Digital Continuous Profiling Infrastructure, a sampling-based profiling system designed to run continuously on production systems. The system supports multiprocessors, works on unmodified executables, and collects profiles for entire systems, including user programs, shared libraries, and the operating system kernel. Samples are ...
Trace-driven simulation is often used in the design of computer systems, especially caches and translation lookaside buffers. Capturing address traces to drive such simulations has been problematic, often involving 1000:1 software overhead to trace a target workload, and/or mechanisms that cause significant distortions in the recorded data. A new technique ...
When Digital started to design the Alpha AXP architecture in the fall of 1988, the Alpha AXP team was concerned with running existing VAX™ code and MIPS™ code on the new Alpha AXP computers [5, 6]. To get full performance on a new computer architecture, an application must be ported by rebuilding, using native compilers. For a single program written ...
The design of high-performance multiprocessor systems necessitates a careful analysis of the memory system performance of parallel programs. Lacking multiprocessor address traces, previous multiprocessor performance studies using analytical models had to make an inordinate number of assumptions about the underlying memory reference patterns. We previously ...
The Alpha AXP™ architecture grew out of a small task force chartered in 1988 to explore ways to preserve the VAX/VMS™ customer base through the 1990s. This group eventually came to the conclusion that a new RISC architecture would be needed before the turn of the century, primarily because 32-bit architectures will run out of address bits. Once ...
A variant of PASCAL pseudo-code which is suitable for optimization is presented. This new language, Universal pseudo-code, is designed to be easily extended to meet the needs of a variety of target machines. The language is further designed such that only one optimizer need be written for it. This approach lends itself well to the portable software spirit ...
The context of this paper is a machine-independent Pascal optimizer that transforms an intermediate stack-machine pseudo-code program into a generally smaller and faster pseudo-code program. The emphasis of this current paper is on the approach taken for mapping registers and storage, using an abstract but practical definition of the target machine's ...
The Cray-1 is an extremely high-speed computer, intended to be used for large floating-point scientific computations. However, it is a well-balanced machine that can gracefully be used on a wide class of problems. The machine has two major architectural innovations: (1) 128 backup registers which represent a new layer in the memory hierarchy, essentially a ...