- Alfredo Buttari, Julien Langou, Jakub Kurzak, Jack J. Dongarra
- Parallel Computing
- 2009

As multicore systems continue to gain ground in the high-performance computing world, linear algebra algorithms have to be reformulated or new algorithms have to be developed in order to take advantage of the architectural features on these new processors. Fine-grain parallelism becomes a major requirement and introduces the necessity of loose…

- Emmanuel Agullo, Jim Demmel, +6 authors Stanimire Tomov
- 2009

The emergence and continuing use of multi-core architectures and graphics processing units require changes in the existing software and sometimes even a redesign of the established algorithms in order to take advantage of the now-prevailing parallelism. Parallel Linear Algebra for Scalable Multi-core Architectures (PLASMA) and Matrix Algebra on GPU and Multics…

- Jakub Kurzak, Alfredo Buttari, Jack J. Dongarra
- IEEE Transactions on Parallel and Distributed…
- 2008

The Sony/Toshiba/IBM (STI) CELL processor introduces pioneering solutions in processor architecture. At the same time it presents new challenges for the development of numerical algorithms. One is effective exploitation of the differential between the speed of single and double precision arithmetic; the other is efficient parallelization between the short…

- Jakub Kurzak, Wesley Alvaro, Jack J. Dongarra
- Parallel Computing
- 2009

Matrix multiplication is one of the most common numerical operations, especially in the area of dense linear algebra, where it forms the core of many important algorithms, including solvers of linear systems of equations, least squares problems, and singular value and eigenvalue computations. The STI CELL processor exceeds the capabilities of any other processor…

- Jakub Kurzak, Stanimire Tomov, Jack J. Dongarra
- IEEE Transactions on Parallel and Distributed…
- 2012

In recent years, the use of graphics chips has been recognized as a viable way of accelerating scientific and engineering applications, even more so since the introduction of the Fermi architecture by NVIDIA, with features essential to numerical computing, such as fast double precision arithmetic and memory protected with error correction codes. Being the…

- Alfredo Buttari, Julien Langou, Jakub Kurzak, Jack J. Dongarra
- PPAM
- 2007

As multicore systems continue to gain ground in the high-performance computing world, linear algebra algorithms have to be reformulated or new algorithms have to be developed in order to take advantage of the architectural features on these new processors. Fine-grain parallelism becomes a major requirement and introduces the necessity of loose…

- Julie Langou, Julien Langou, Piotr Luszczek, Jakub Kurzak, Alfredo Buttari, Jack Dongarra
- SC '06
- 2006

Recent versions of microprocessors exhibit performance for 32-bit floating-point arithmetic (single precision) that is substantially higher than for 64-bit floating-point arithmetic (double precision). Examples include Intel's Pentium IV and M processors, AMD's Opteron architectures, and IBM's Cell Broadband Engine processor. When working in…

The dataflow model is gaining popularity as a paradigm for programming multicore processors and multi-socket systems of such processors. This work proposes a programming interface and an implementation for a dataflow-based scheduler, which dispatches tasks dynamically at runtime. The scheduler relies on data dependency analysis between tasks in a sequential…

It is difficult to estimate the magnitude of the discontinuity that the high performance computing (HPC) community is about to experience because of the emergence of the next generation of multi-core and heterogeneous processor designs [4]. For at least two decades, HPC programmers have taken for granted that each successive generation of microprocessors…