Jorge González-Domínguez

Learn More
—The growing complexity in computer system hierarchies due to the increase in the number of cores per processor, levels of cache (some of them shared) and the number of processors per node, as well as the high-speed interconnects, demands the use of new optimization techniques and libraries that take advantage of their features. In this paper Servet, a(More)
To efficiently scale dense linear algebra problems to future exascale systems, communication cost must be avoided or overlapped. Communication-avoiding 2.5D algorithms improve scalability by reducing inter-processor data transfer volume at the cost of extra memory usage. Communication overlap attempts to hide messaging latency by pipelining messages and(More)
The popularity of Partitioned Global Address Space (PGAS) languages has increased during the last years thanks to their high programmability and performance through an efficient exploitation of data locality, especially on hierarchical architectures such as multicore clusters. This paper describes UP-CBLAS, a parallel numerical library for dense matrix(More)
Servet is a suite of benchmarks focused on detecting a set of parameters with high influence on the overall performance of multicore systems. These parameters can be used for autotuning codes to increase their performance on multicore clusters. Although Servet has been proved to detect accurately cache hierarchies, bandwidths and bottlenecks in memory(More)
The popularity of Partitioned Global Address Space (PGAS) languages has increased during the last years thanks to their high programmability and performance through an efficient exploitation of data locality. This paper describes the implementation of efficient parallel dense triangular solvers in the PGAS language Unified Parallel C (UPC). The solvers are(More)
Unified Parallel C (UPC) is a Partitioned Global Address Space (PGAS) language that exhibits high performance and portability on a broad class of shared and distributed memory parallel architectures. This paper describes the design and implementation of a parallel numerical library for UPC built on top of the sequential BLAS routines. The developed library(More)
Unified Parallel C (UPC) is a Partitioned Global Address Space (PGAS) language whose popularity has increased during the last years owing to its high programmability and reasonable performance through an efficient exploitation of data locality, especially on hierarchical architectures like multicore clusters. However, the performance issues that arise in(More)
Cloud computing is currently being explored by the scientific community to assess its suitability for High Performance Computing (HPC) environments. In this novel paradigm, compute and storage resources, as well as applications, can be dynamically provisioned on a pay-per-use basis. This paper presents a thorough evaluation of the I/O storage subsystem(More)
UNLABELLED Current next generation sequencing technologies often generate duplicated or near-duplicated reads that (depending on the application scenario) do not provide any interesting biological information but can increase memory requirements and computational time of downstream analysis. In this work we present ParDRe, a de novo parallel tool to remove(More)