DARPA's Ubiquitous High-Performance Computing (UHPC) program asked researchers to develop computing systems capable of achieving energy efficiencies of 50 GOPS/Watt, assuming 2018-era fabrication technologies. This paper describes Runnemede, the research architecture developed by the Intel-led UHPC team. Runnemede is being developed through a co-design …
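As a rough illustration of what the 50 GOPS/Watt target implies, a minimal back-of-the-envelope sketch follows; the exascale power figure is derived here for illustration and is not a number quoted in the abstract.

```python
# Back-of-the-envelope energy budget implied by a 50 GOPS/Watt target.
# The exascale extrapolation below is an illustrative assumption.

TARGET_GOPS_PER_WATT = 50.0

# Energy allowed per operation: 1 W / (50e9 ops/s) = 20 pJ per op.
joules_per_op = 1.0 / (TARGET_GOPS_PER_WATT * 1e9)
print(f"Energy budget per operation: {joules_per_op * 1e12:.1f} pJ")

# A machine sustaining 1e18 ops/s at this efficiency would draw 20 MW.
exa_ops_per_sec = 1e18
power_watts = exa_ops_per_sec / (TARGET_GOPS_PER_WATT * 1e9)
print(f"Power at 1 exaop/s: {power_watts / 1e6:.0f} MW")
```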
Simulations of biologically realistic neurons in large, densely connected networks pose many problems for application programmers, particularly on distributed-memory computers. We discuss simulations of hundreds of thousands to millions of cells in a model of neocortex in the context of new computing platforms with many tens of thousands to hundreds of …
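A central distributed-memory issue such simulations face is partitioning the cell population across ranks. The sketch below shows the simplest possible assignment rule; the round-robin policy and all names are illustrative assumptions, not the simulator described in the abstract.

```python
# Minimal sketch of partitioning a large neuron population across
# distributed-memory ranks. Round-robin assignment is an illustrative
# assumption; it balances load only when per-cell cost is roughly uniform.

def partition_cells(num_cells: int, num_ranks: int) -> dict:
    """Return a mapping from rank id to the list of cell ids it owns."""
    assignment = {rank: [] for rank in range(num_ranks)}
    for cell_id in range(num_cells):
        assignment[cell_id % num_ranks].append(cell_id)
    return assignment

if __name__ == "__main__":
    cells_per_rank = partition_cells(num_cells=1_000_000, num_ranks=50_000)
    print(len(cells_per_rank[0]), "cells on rank 0")  # 20 cells on rank 0
```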
High-performance computing has been on an inexorable march from gigascale to tera- and petascale, with many researchers now actively contemplating exascale (10^18, or a million trillion operations per second) systems. This progression is being accelerated by the rapid increase in multi- and many-core processors, which allow even greater …
Large-scale internet services aim to remain highly available and responsive in the presence of unexpected failures. Providing this service often requires monitoring and analyzing tens of millions of measurements per second across a large number of systems, and one particularly effective solution is to store and query such measurements in a time series …
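The essential interface such a monitoring pipeline needs is "append a timestamped measurement to a named series, then query a time window." The toy in-memory store below illustrates only that interface; it is an assumption for exposition and not the system the abstract describes.

```python
# Toy in-memory time-series store: append per-series points, query a window.
import bisect
from collections import defaultdict

class TinyTSDB:
    def __init__(self):
        # series name -> list of (timestamp, value), appended in time order
        self._series = defaultdict(list)

    def append(self, name: str, ts: float, value: float) -> None:
        self._series[name].append((ts, value))  # assumes monotonic timestamps

    def query(self, name: str, start: float, end: float):
        points = self._series[name]
        lo = bisect.bisect_left(points, (start, float("-inf")))
        hi = bisect.bisect_right(points, (end, float("inf")))
        return points[lo:hi]

db = TinyTSDB()
db.append("host1.cpu", 1.0, 0.42)
db.append("host1.cpu", 2.0, 0.55)
print(db.query("host1.cpu", 0.0, 1.5))  # [(1.0, 0.42)]
```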
We address the problem of scheduling applications represented as directed acyclic task graphs (DAGs) onto architectures with reconfigurable processing cores. We introduce the Mutually Exclusive Processor Groups reconfiguration model, a novel reconfiguration model that captures many different modes of reconfiguration. Additionally, we propose the …
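To make the scheduling problem concrete, the sketch below runs a topologically ordered task list under the constraint that only one processor group is configured at a time and that switching groups costs a reconfiguration delay. The cost model and function names are simplified assumptions, not the scheduler proposed in the paper.

```python
# Simplified list-scheduling sketch with mutually exclusive processor groups:
# switching the active group incurs a reconfiguration delay.

def schedule(tasks, deps, group_of, exec_time, reconfig_delay):
    """tasks: ids in topological order; deps: task -> set of predecessors;
    group_of: task -> processor group; exec_time: task -> duration."""
    finish = {}
    current_group, time = None, 0.0
    for t in tasks:
        ready = max((finish[p] for p in deps.get(t, ())), default=0.0)
        time = max(time, ready)
        if group_of[t] != current_group:   # switching groups costs time
            time += reconfig_delay
            current_group = group_of[t]
        time += exec_time[t]
        finish[t] = time
    return finish

print(schedule(
    tasks=["a", "b"], deps={"b": {"a"}},
    group_of={"a": "G0", "b": "G1"},
    exec_time={"a": 2.0, "b": 3.0}, reconfig_delay=1.0,
))  # {'a': 3.0, 'b': 7.0}
```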
Combining ideas from several previous proposals, such as Active Pages, DIVA, and ULMT, we present the Memory Arithmetic Unit and Interface (MAUI) architecture. Because the "intelligence" of the MAUI intelligent memory system is located in the memory controller, logic and DRAM need not be integrated on a single chip, and use of …
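The offload idea behind such memory-side designs is that only a command crosses the processor/memory interface, while operand streams stay near DRAM. The sketch below illustrates that idea in software; the class and method names are illustrative assumptions and are not the MAUI interface from the paper.

```python
# Conceptual sketch: a memory-bound operation is handed to a memory-side
# unit instead of streaming operands through the CPU.

class MemorySideUnit:
    """Stand-in for compute logic sitting at the memory controller."""

    def __init__(self, dram):
        self.dram = dram  # the 'memory' the unit can reach directly

    def vector_add(self, dst: int, src_a: int, src_b: int, n: int) -> None:
        # Operates on data in place; only this command is sent by the CPU,
        # not the operand streams themselves.
        for i in range(n):
            self.dram[dst + i] = self.dram[src_a + i] + self.dram[src_b + i]

dram = [1.0, 2.0, 3.0, 10.0, 20.0, 30.0, 0.0, 0.0, 0.0]
unit = MemorySideUnit(dram)
unit.vector_add(dst=6, src_a=0, src_b=3, n=3)
print(dram[6:9])  # [11.0, 22.0, 33.0]
```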
We address the problem of scheduling parallel applications onto Heterogeneous Chip Multi-Processors (H-CMPs) containing reconfigurable processing cores. To model reconfiguration, we introduce the novel Mutually Exclusive Processor Groups reconfiguration model, which captures many different modes of reconfiguration. The paper continues by proposing the …
In this paper, we propose a static (compile-time) scheduling extension, designated -MEG, that considers reconfiguration and task execution together when scheduling tasks on reconfigurable hardware organized as Mutually Exclusive Groups, and that can be used to extend any static list scheduler. In simulation, using -MEG generates higher-quality schedules than those …
The memory system is increasingly becoming a performance bottleneck. Several intelligent memory systems, such as the Active Pages, DIVA, and IRAM architectures, have been proposed to alleviate the processor-memory bottleneck. This thesis presents the Memory Arithmetic Unit and Interface (MAUI) architecture. The MAUI architecture combines ideas of the …