- Javier Cuenca, Luis-Pedro García, Domingo Giménez, Jack J. Dongarra
- 2005 IEEE International Conference on Cluster…
- 2005

This paper presents a self-optimization methodology for parallel linear algebra routines on heterogeneous systems. For each routine, a series of decisions is taken automatically in order to obtain an…

- Luis-Pedro García, Javier Cuenca, Domingo Giménez
- ICSOFT
- 2007

The design of hierarchies of libraries helps to obtain modular and efficient sets of routines to solve problems of specific fields. An example is ScaLAPACK’s hierarchy in the field of parallel linear…

- Javier Cuenca, Luis-Pedro García, Domingo Giménez
- 2012 20th Euromicro International Conference on…
- 2012

The most computationally demanding scientific and engineering problems are solved with large parallel systems. In some cases those systems are Non-Uniform Memory Access multiprocessors made up of a…

Introduction. Multicore processor, cc-NUMA systems can offer performance improvements. Necessary software optimization techniques to benefit from the potential of the hardware. Modelling the execution…

- Jesús Cámara, Javier Cuenca, Domingo Giménez, Luis-Pedro García, Antonio M. Vidal
- International Journal of Parallel Programming
- 2013

The introduction of auto-tuning techniques in linear algebra shared-memory routines is analyzed. Information obtained in the installation of the routines is used at running time to take some…

- Jesús Cámara, Javier Cuenca, Luis-Pedro García, Domingo Giménez
- Parallel Computing
- 2014

Scientific and engineering problems are solved with large parallel systems. In some cases those systems are NUMA: a large number of cores share a hierarchically organized memory. Kernel of the…

- Javier Cuenca, Luis-Pedro García, Domingo Giménez, Francisco-José Herrera
- Concurrency and Computation: Practice and…
- 2017

- Luis-Pedro García, Javier Cuenca, Francisco-José Herrera, Domingo Giménez
- PMAM@PPoPP
- 2016

Computational systems are nowadays composed of basic computational components which share multiprocessors and coprocessors of different types, typically several GPUs or MICs. The software previously…

- Luis-Pedro García, Javier Cuenca, Domingo Giménez
- PPAM
- 2007

The performance of parallel linear algebra routines can be improved automatically using different methods. Our technique is based on modelling the execution time of each routine, using…

- Gregorio Bernabé, Javier Cuenca, Luis-Pedro García, Domingo Giménez
- The Journal of Supercomputing
- 2014

This paper presents an enhanced auto-optimization method to run the 3D-Fast Wavelet Transform on different computing units in a system (GPU, MIC, CPU). The proposed method automatically selects a set…