- Alfredo Buttari, Julien Langou, Jakub Kurzak, Jack J. Dongarra
- Parallel Computing
- 2009

As multicore systems continue to gain ground in the High Performance Computing world, linear algebra algorithms have to be reformulated or new algorithms have to be developed in order to take advantage of the architectural features of these new processors. Fine-grain parallelism becomes a major requirement and introduces the necessity of loose…

- Emmanuel Agullo, Jim Demmel, +6 authors Stanimire Tomov
- 2009

The emergence and continuing use of multi-core architectures and graphics processing units require changes in the existing software and sometimes even a redesign of the established algorithms in order to take advantage of the now-prevailing parallelism. Parallel Linear Algebra for Scalable Multi-core Architectures (PLASMA) and Matrix Algebra on GPU and Multicore…

- Alfredo Buttari, Julien Langou, Jakub Kurzak, Jack J. Dongarra
- PPAM
- 2007

As multicore systems continue to gain ground in the high-performance computing world, linear algebra algorithms have to be reformulated or new algorithms have to be developed in order to take advantage of the architectural features of these new processors. Fine-grain parallelism becomes a major requirement and introduces the necessity of loose…

- Jakub Kurzak, Stanimire Tomov, Jack J. Dongarra
- IEEE Transactions on Parallel and Distributed…
- 2012

In recent years, the use of graphics chips has been recognized as a viable way of accelerating scientific and engineering applications, even more so since the introduction of the Fermi architecture by NVIDIA, with features essential to numerical computing, such as fast double precision arithmetic and memory protected with error correction codes. Being the…

- Jakub Kurzak, Hatem Ltaief, Jack J. Dongarra, Rosa M. Badia
- Concurrency and Computation: Practice and…
- 2010

State-of-the-art dense linear algebra software, such as the LAPACK and ScaLAPACK libraries, suffers performance losses on multicore processors due to its inability to fully exploit thread-level parallelism. At the same time, the coarse-grain dataflow model gains popularity as a paradigm for programming multicore architectures. This work looks at…

- Julie Langou, Julien Langou, Piotr Luszczek, Jakub Kurzak, Alfredo Buttari, Jack Dongarra
- SC '06
- 2006

Recent versions of microprocessors exhibit performance for 32-bit floating point arithmetic (single precision) that is substantially higher than for 64-bit floating point arithmetic (double precision). Examples include Intel's Pentium IV and M processors, AMD's Opteron architectures and IBM's Cell Broadband Engine processor. When working in…

- Jakub Kurzak, Alfredo Buttari, Jack J. Dongarra
- IEEE Transactions on Parallel and Distributed…
- 2008

The Sony/Toshiba/IBM (STI) CELL processor introduces pioneering solutions in processor architecture. At the same time, it presents new challenges for the development of numerical algorithms. One is effective exploitation of the differential between the speed of single and double precision arithmetic; the other is efficient parallelization between the short…

- Jakub Kurzak, Jack J. Dongarra
- PARA
- 2006

Linear algebra algorithms commonly encapsulate parallelism in Basic Linear Algebra Subprograms (BLAS). This solution relies on the fork-join model of parallel execution, which may result in suboptimal performance on current and future generations of multi-core processors. To overcome the shortcomings of this approach, a pipelined model of parallel execution…