Ernesto Dufrechu

In this paper we present new hybrid CPU-GPU routines to accelerate the solution of linear systems with a band coefficient matrix, by off-loading the major part of the computations to the GPU and leveraging highly tuned implementations of the BLAS for the graphics processor. Our experiments with an nVidia S2070 GPU report speed-ups of up to 6× for the hybrid…
Lyapack is a package for the solution of large-scale sparse problems arising in control theory. The package has a modular design, and is implemented as a Matlab toolbox, which renders it easy to utilize, modify and extend with new functionality. However, in general, the use of Matlab in combination with a general-purpose multi-core architecture (CPU) offers…
The solution of large-scale Lyapunov equations is an important tool for the solution of several engineering problems arising in optimal control and model order reduction. In this work, we investigate the case when the coefficient matrix of the equations presents a band structure. Exploiting the structure of this matrix, we can achieve relevant reductions in…
Uruguay is currently undergoing a gradual process of incorporating wind energy into its electric power generation mix. In this context, a computational tool has been developed to predict the electrical power that will be injected into the grid. The tool is based on the Weather Research and Forecasting (WRF) numerical model, which is the performance…
In this paper, we address the exploitation of data parallelism for the solution of sparse symmetric positive definite linear systems via iterative methods on Graphics Processing Units (GPUs). In particular, we accelerate the preconditioned CG-based iterative solver underlying the incomplete LU decomposition package (ILUPACK) by off-loading the most…
Linear algebra operations arise in a myriad of scientific and engineering applications and, therefore, their optimization is targeted by a significant number of high performance computing research efforts. In particular, the matrix multiplication and the solution of linear systems are two key problems with efficient implementations (or kernels) for a…
We analyze the efficiency of servers equipped with state-of-the-art general-purpose multicore processors as well as platforms based on accelerators such as graphics processing units (GPUs) and the Intel Xeon Phi. Following the proposal recently advocated in the High Performance Conjugate Gradient (HPCG) benchmark, we leverage for this purpose efficient…