Impulse is a new memory system architecture that adds two important features to a traditional memory controller. First, Impulse supports application-specific optimizations through configurable physical address remapping. By remapping physical addresses, applications control how their data is accessed and cached, improving their cache and bus utilization.
Modern operating systems must support a wide variety of services for a diverse set of users. Designers of these systems face a tradeoff between functionality and performance. Systems like Mach provide a set of general abstractions and attempt to handle every situation, which can lead to poor performance for common cases. Other systems, such as Unix, provide...
Shared memory programs guarantee the correctness of concurrent accesses to shared data using interprocessor synchronization operations. The most common synchronization operators are locks, which are traditionally implemented via a mix of shared memory accesses and hardware synchronization primitives like test-and-set. In this paper, we argue that...
In this paper, we describe the design of the Avalanche multiprocessor's shared memory subsystem, evaluate its performance, and discuss problems associated with using commodity workstations and network interconnects as the building blocks of a scalable shared memory multiprocessor. Compared to other scalable shared memory architectures, Avalanche has a number...
Scalable shared memory multiprocessors traditionally use either a cache coherent non-uniform memory access (CC-NUMA) or simple cache-only memory architecture (S-COMA) memory architecture. Recently, hybrid architectures that combine aspects of both CC-NUMA and S-COMA have emerged. In this paper, we present two improvements over other hybrid...
Efficient synchronization is an essential component of parallel computing. The designers of traditional multiprocessors have included hardware support only for simple operations such as compare-and-swap and load-linked/store-conditional, while high level synchronization primitives such as locks, barriers, and condition variables have been implemented in...
As the gap between processor and memory speeds widens, system designers will inevitably incorporate increasingly deep memory hierarchies to maintain the balance between processor and memory system performance. At the same time, most communication subsystems are permitted access only to main memory and not a processor's top level cache. As memory latencies...
Because irregular applications have unpredictable memory access patterns, their performance is dominated by memory behavior. The Impulse configurable memory controller will enable significant performance improvements for irregular applications, because it can be configured to optimize memory accesses on an application-by-application basis. In this paper we...
Conventional object-oriented programming systems allow application programmers to structure each application as a set of objects. They do not allow long-term storage of the objects, nor do they allow sharing and concurrency within the object spaces. Persistent object systems and object-oriented databases have been developed to address some of these...