Corpus ID: 62143065

The Landscape of Parallel Computing Research: A View from Berkeley

@techreport{Asanovi2006TheLO,
  title={The Landscape of Parallel Computing Research: A View from Berkeley},
  author={Krste Asanovi{\'c} and Rastislav Bod{\'i}k and Bryan Catanzaro and Joseph Gebis and Parry Husbands and Kurt Keutzer and David Patterson and William Plishker and John Shalf and Samuel Williams and Katherine Yelick},
  institution={EECS Department, University of California, Berkeley},
  number={UCB/EECS-2006-183},
  year={2006}
}
Abstract: The recent switch to parallel microprocessors is a milestone in the history of computing. Industry has laid out a roadmap for multicore designs that preserves the programming paradigm of the past via binary compatibility and cache coherence. Conventional wisdom is now to double the number of cores on a chip with each silicon generation. A…
Citations

Auto-tuning performance on multicore computers
  Auto-tuning consistently delivers speedups in excess of 3× across all multicore computers except the memory-bound Intel Clovertown, where the benefit was as little as 1.5×.
Communication for programmability and performance on multi-core processors
  This dissertation addresses the programmability challenges of the multi-core era, proposing an asynchronous remote store instruction, issued by one core and completed asynchronously by another into its own local cache, and evaluating several patterns of parallel communication.
Optimization of Scientific Computation for Multicore Systems
  This thesis examines sorting, matrix multiplication, and ordinary differential equation initial value problems on two target architectures, the Cell Broadband Engine and the Nvidia CUDA-enabled graphics processor, to exploit multiple levels of parallelism.
Pitfalls and Issues of Manycore Programming
  This chapter explains the primary difficulties and issues of manycore programming and the human factor in the success of the parallel revolution.
HPPC 2007: Workshop on Highly Parallel Processing on a Chip
  Argues that the PRAM-on-chip approach is a promising candidate for the processor of the future, and that concentrating on a small number of promising approaches would benefit both the field as a whole and individual researchers seeking greater impact.
The Parallel Revolution Has Started: Are You Part of the Solution or Part of the Problem? - An Overview of Research at the Berkeley Parallel Computing Laboratory
  This talk gives an update on the Par Lab two years on, including a surprisingly compact set of recurring computational patterns, termed "motifs", and argues that any successful software architecture, parallel or serial, can be described as a hierarchy of patterns.
OpenCL and the 13 dwarfs: a work in progress
  This combined "work-in-progress and vision" paper seeks to delineate application requirements in a manner not overly specific to individual applications or to optimizations for particular hardware platforms, so that broader conclusions about hardware requirements can be drawn.
The Future of Accelerator Programming: Abstraction, Performance or Can We Have Both?
  Investigates current approaches to portable accelerator programming, asking whether they can combine high efficiency with sufficient algorithmic abstraction, and presents three approaches to writing portable code: GPU-centric, CPU-centric, and combined.
Operating System Support for Parallel Processes
  Describes the MCP abstraction and the salient details of Akaros, and discusses how the kernel and user-level libraries cooperate to give an application control over its physical resources and to adapt to the revocation of cores at any time, even while the code is holding locks.
Can One-Chip Parallel Computing Be Liberated From Ad Hoc Solutions? A Computation Model Based Approach and Its Implementation
  In July 2010 David Patterson said in IEEE Spectrum that "the semiconductor industry threw the equivalent of a Hail Mary pass when it switched from making microprocessors run faster to putting more of…

References

Showing 10 of 142 references.
RAMP: research accelerator for multiple processors - a community vision for a shared experimental parallel HW/SW platform
  The acronym RAMP (Research Accelerator for Multiple Processors) has the potential to transform the parallel computing community in computer science from a simulation-driven to a prototype-driven discipline, enabling rapid iteration across the interfaces of the many fields of multiple processors and thereby moving much more quickly to a parallel foundation for large-scale computer systems research in the 21st century.
The Parallel Computing Laboratory at U.C. Berkeley: A Research Agenda Based on the Berkeley View
  This report is based on a proposal for creating a Universal Parallel Computing Research Center (UPCRC) that a technical committee from Intel and Microsoft unanimously selected as the top proposal in a competition among the top 25 computer science departments.
X10: an object-oriented approach to non-uniform cluster computing
  X10, a modern object-oriented programming language, is designed for high-performance, high-productivity programming of NUCC systems; the paper presents an overview of the X10 programming model and language, experience with the reference implementation, and results from initial productivity comparisons between the X10 and Java™ languages.
Microprocessors for the new millennium: Challenges, opportunities, and new frontiers
  • P. Gelsinger
  • Computer Science
  • 2001 IEEE International Solid-State Circuits Conference. Digest of Technical Papers. ISSCC (Cat. No.01CH37177)
  • 2001
  Future microprocessors will evolve as integration of DSP capabilities becomes imperative to enable applications such as media-rich communications, computer vision, and speech recognition, leading to a shift from today's data-based, machine-based computing to tomorrow's knowledge-based, human-based computing.
The Tera computer system
  The Tera architecture was designed with several goals in mind; it needed to be suitable for very high-speed implementations, …
An Experiment in Measuring the Productivity of Three Parallel Programming Languages
In May 2005, a 4.5 day long productivity study was performed at the Pittsburgh Supercomputing Center as part of the IBM HPCS/PERCS project, comparing the productivity of three parallel programmingExpand
A stream compiler for communication-exposed architectures
  Describes a fully functional compiler that parallelizes StreamIt applications for Raw, including several load-balancing transformations, and demonstrates that the StreamIt compiler can automatically map a high-level stream abstraction to Raw without losing performance.
High-level programming language abstractions for advanced and dynamic parallel computations
  By including a set of p-dependent abstractions in a language with a largely p-independent framework, the task of parallel programming is greatly simplified; ZPL code is shown to be easier to write than MPI code while its performance remains competitive with MPI.
The cascade high productivity language
  • D. Callahan, B. Chamberlain, H. Zima
  • Computer Science
  • Ninth International Workshop on High-Level Parallel Programming Models and Supportive Environments, 2004. Proceedings.
  • 2004
  Describes the design of Chapel, the Cascade high-productivity language, which is being developed in the DARPA-funded HPCS project Cascade led by Cray Inc. and pushes the state of the art in languages for HEC system programming by focusing on productivity.
Optimizing matrix multiply using PHiPAC: a portable, high-performance, ANSI C coding methodology
  A BLAS GEMM-compatible, multi-level cache-blocked matrix multiply generator produces code that achieves around 90% of peak on the Sparcstation-20/61, IBM RS/6000-590, HP 712/80i, SGI Power Challenge R8k, and SGI Octane R10k, and over 80% of peak on the SGI Indigo R4k.