Guided Self-Scheduling: A Practical Scheduling Scheme for Parallel Supercomputers
For certain types of loops, it is shown analytically that guided self-scheduling uses minimal overhead and achieves optimal schedules. Experimental results that clearly show the advantage of guided self-scheduling over the most widely known dynamic methods are also discussed.
The Perfect Club Benchmarks: Effective Performance Evaluation of Supercomputers
- M. Berry, Da-Ren Chen, Joanne L. Martin
- Computer Science, Int. J. High Perform. Comput. Appl.
- 1 September 1989
A methodology for measuring the performance of supercomputers, including 13 Fortran programs that total over 50,000 lines of source code, and a set of guidelines that allow portability to several types of machines are presented.
Dependence graphs and compiler optimizations
This paper defines dependence graphs and discusses two kinds of transformations: simple rewriting transformations that remove dependence arcs, and abstraction transformations that deal more globally with a dependence graph.
The ILLIAC IV Computer
- G. H. Barnes, Richard M. Brown, Maso Kato, D. Kuck, D. Slotnick, Richard A. Stokes
- Computer Science, IEEE Transactions on Computers
- 1 August 1968
The structure of ILLIAC IV, a parallel-array computer containing 256 processing elements, is described. Special features include multiarray processing, multiprecision arithmetic, and fast…
A Survey of Parallel Machine Organization and Programming
- D. Kuck
- Computer Science, CSUR
- 1 March 1977
This paper is a survey of parallel machine organizations and programming, and various aspects of machine organization are discussed, including processors, memories, and alignment networks.
On Stable Parallel Linear System Solvers
Three stable parallel algorithms for solving dense and tridiagonal systems of linear equations are discussed and one of the algorithms presented here is superior to the best previous algorithm in that with a modest increase in time.
On the Number of Operations Simultaneously Executable in Fortran-Like Programs and Their Resulting Speedup
Algorithms are presented for handling arithmetic assignment statements, DO loops and IF statement trees, and evidence is given that for very simple Fortran programs 16 processors could be effectively used operating simultaneously in a parallel or pipeline fashion.
Supercomputer performance evaluation and the Perfect Benchmarks
The Perfect Benchmark™ Suite has evolved from a supercomputer performance evaluation plan, presented by Kuck and Sameh at the 1987 International Conference on Supercomputing, to a vigorous international activity.
The Burroughs Scientific Processor (BSP)
The Burroughs Scientific Processor (BSP), a high-performance computer system, performed the Department of Energy LLL loops at roughly the speed of the CRAY-1. The BSP combined parallelism and…
CEDAR: a large scale multiprocessor
Various aspects of Cedar, a large-scale multiprocessor being designed at the University of Illinois, are described, including the control methodology, communication network, optimizing compiler, and plans for construction.