The increase in performance of the last generations of graphics processors (GPUs) has made this class of platform a coprocessing tool with remarkable success in certain types of operations. In this paper we evaluate the performance of the Level 3 operations in CUBLAS, the implementation of BLAS for NVIDIA® GPUs with unified architecture. From this study,…
The increasing computing requirements for GPUs (Graphics Processing Units) have favoured the design and marketing of commodity devices that nowadays can also be used to accelerate general purpose computing. Therefore, future high performance clusters intended for HPC (High Performance Computing) will likely include such devices. However, high-end GPU-based…
Solving Dense Linear Systems on GPUs. Barrachina et al. The power and versatility of modern GPUs have transformed them into the first widely extended HPC platform. The solution of dense linear systems arises in a wide variety of fields. How does the new generation of GPUs adapt to this type of problem?…
This paper analyzes the performance of two parallel algorithms for solving the linear-quadratic optimal control problem arising in discrete-time periodic linear systems. The algorithms perform a sequence of orthogonal reordering transformations on formal matrix products associated with the periodic linear system, and then employ the so-called matrix disk…
In this paper we detail the key features, architectural design, and implementation of rCUDA, an advanced framework to enable remote and transparent GPGPU acceleration in HPC clusters. rCUDA allows decoupling GPUs from nodes, forming pools of shared accelerators, which brings enhanced flexibility to cluster configurations. This opens the door to…
In this paper we analyze the trade-off between energy and performance for a data-parallel execution of the LU factorization with partial pivoting on a multi-core processor. To improve energy efficiency, we adapt the runtime in charge of controlling the concurrent execution of the algorithm to leverage DVFS and block idle threads. For a CPU-bound operation…
ISDEP (Integrator of Stochastic Differential Equations for Plasmas) is a Monte Carlo code that solves the plasma dynamics in a fusion device and scales perfectly on distributed computing platforms. Montera is a recent framework developed for achieving Grid-efficient executions of Monte Carlo applications such as ISDEP. In this work, the improvement of…