- Matthias Petschow, Elmar Peise, Paolo Bientinesi
- SIAM J. Scientific Computing
- 2013

We introduce a new collection of solvers, subsequently called EleMRRR, for large-scale dense Hermitian eigenproblems. EleMRRR solves various types of problems: generalized, standard, and tridiagonal eigenproblems. Among these, the last is of particular importance as it is a solver in its own right, as well as the computational kernel for the first two; we…

- Edoardo Di Napoli, Elmar Peise, Markus Hrywniak, Paolo Bientinesi
- Computer Physics Communications
- 2017

One of the greatest efforts of computational scientists is to translate the mathematical model describing a class of physical phenomena into large and complex codes. Many of these codes face the difficulty of implementing the mathematical operations in the model in terms of low-level optimized kernels offering both performance and portability. Legacy codes…

- Elmar Peise, Paolo Bientinesi
- 2012 SC Companion: High Performance Computing…
- 2012

It is well known that the behavior of dense linear algebra algorithms is greatly influenced by factors like target architecture, underlying libraries and even problem size; because of this, the accurate prediction of their performance is a real challenge. In this article, we are not interested in creating accurate models for a given algorithm, but in…

- Elmar Peise, Diego Fabregat-Traver, Paolo Bientinesi
- PMBS@SC
- 2014

Tensor operations are surging as the computational building blocks for a variety of scientific simulations, and the development of high-performance kernels for such operations is known to be a challenging task. While for operations on one- and two-dimensional tensors there exist standardized interfaces and highly optimized libraries (BLAS), for higher…

- Elmar Peise
- ArXiv
- 2012

- Elmar Peise, Diego Fabregat-Traver, Yurii S. Aulchenko, Paolo Bientinesi
- EuroMPI
- 2013

In order to associate complex traits with genetic polymorphisms, genome-wide association studies process huge datasets involving tens of thousands of individuals genotyped for millions of polymorphisms. When handling these datasets, which exceed the main memory of contemporary computers, one faces two distinct challenges: 1) Millions of polymorphisms come…

- Elmar Peise, Paolo Bientinesi
- VECPAR
- 2014

It is universally known that caching is critical to attain high-performance implementations: in many situations, data locality (in space and time) plays a bigger role than optimizing the (number of) arithmetic floating-point operations. In this paper, we show evidence that, at least for linear algebra algorithms, caching is also a crucial factor for accurate…

- Elmar Peise, Diego Fabregat-Traver, Paolo Bientinesi
- Parallel Computing
- 2015

In order to associate complex traits with genetic polymorphisms, genome-wide association studies process huge datasets involving tens of thousands of individuals genotyped for millions of polymorphisms. When handling these datasets, which exceed the main memory of contemporary computers, one faces two distinct challenges: 1) Millions of polymorphisms and…

- Elmar Peise, Paolo Bientinesi
- ArXiv
- 2016

To exploit both memory locality and the full performance potential of highly tuned kernels, dense linear algebra libraries such as LAPACK commonly implement operations as blocked algorithms. However, to achieve next-to-optimal performance with such algorithms, significant tuning is required. On the other hand, recursive algorithms are virtually tuning-free,…