Michael Lang

Learn More
Roadrunner is a 1.38 Pflop/s-peak (double precision) hybrid-architecture supercomputer developed by LANL and IBM. It contains 12,240 IBM PowerXCell 8i processors and 12,240 AMD Opteron cores in 3,060 compute nodes. Roadrunner is the first supercomputer to run Linpack at a sustained speed in excess of 1 Pflop/s. In this paper we present a detailed(More)
Clustered systems have become a dominant architecture of scalable high-performance super computers. In these large-scale computers, the network performance and scalability is as critical as the computenodes speed. InfiniBandTM has become a commodity networking solution supporting the stringent latency, bandwidth and scalability requirements of these(More)
In order to establish the relative distribution of a GFAP-negative population of astrocytes, and its change in gliotic tissue, sections of the stratum radiatum of the CA1 hippocampal layer of male, adult, Wistar rats were analyzed by immunocytochemical methods. Ten micrometer-thick sections were triple-stained to detect nuclei, glial fibrillary acidic(More)
This work provides a performance analysis of three leading supercomputers that have recently been deployed: Purple, Red Storm and Blue Gene/L. Each of these machines are architecturally diverse, with very different performance characteristics. Each contains over 10,000 processors and has a system peak of over 40 Teraflops. We analyze each system using a(More)
In this work we present an initial performance evaluation of Intel's latest, secondgeneration quad-core processor, Nehalem, and provide a comparison to first-generation AMD and Intel quad-core processors Barcelona and Tigerton. Nehalem is the first Intel processor to implement a NUMA architecture incorporating QuickPath Interconnect for interconnecting(More)
In the late 1990’s, there was substantial activity within the “Web engineering” research community and a multitude of new Web approaches were proposed. However, numerous studies have revealed major gaps in these approaches, including coverage and interoperability. In order to address these gaps, the Model-Driven Engineering (MDE) paradigm offers a new(More)
Load balancing techniques (e.g. work stealing) are important to obtain the best performance for distributed task scheduling systems that have multiple schedulers making scheduling decisions. In work stealing, tasks are randomly migrated from heavy-loaded schedulers to idle ones. However, for data-intensive applications where tasks are dependent and task(More)
Based on a set of measurements done on the 512-node 500MHz prototype and early results on a 2048 node 700MHz BlueGene/L machine at IBM Watson, we present a performance and scalability analysis of the architecture from low-level characteristics to large-scale applications. In addition, we present predictions using our models for the performance of two(More)
One way to efficiently utilize the coming exascale machines is to support a mixture of applications in various domains, such as traditional large-scale HPC, the ensemble runs, and the fine-grained many-task computing (MTC). Delivering high performance in resource allocation, scheduling and launching for all types of jobs has driven us to develop Slurm++, a(More)