Alessandro Fanfarillo

Coarray Fortran is a set of features of the Fortran 2008 standard that make Fortran a PGAS parallel programming language. Two commercial compilers currently support coarrays: Cray and Intel. Here we present two coarray transport layers provided by the new OpenCoarrays project: one library based on MPI and the other on GASNet. We link the GNU Fortran...
Hybrid nodes containing GPUs are rapidly becoming the norm in parallel machines. We have conducted experiments on how to plug GPU-enabled computational kernels into PSBLAS, an MPI-based library specifically geared towards sparse matrix computations. In this paper, we present our findings on which strategies are more promising in the quest for the...
Hybrid GPU/CPU clusters are becoming very popular in the scientific computing community, as attested by the number of such systems present in the Top 500 list. In this paper, we address one of the key algorithms for scientific applications: the computation of sparse matrix-vector products that lies at the heart of iterative solvers for sparse linear...
Coarray Fortran is a set of features of the Fortran 2008 standard that make Fortran a PGAS language. Currently, coarray support is provided mainly by commercial compilers such as those from Cray and Intel. In this work we present two coarray implementations for the GNU Fortran compiler. We present a performance comparison between our coarray implementations and...
The multiplication of a sparse matrix by a dense vector (SpMV) is a centerpiece of scientific computing applications: it is the essential kernel for the solution of sparse linear systems and sparse eigenvalue problems by iterative methods. The efficient implementation of the sparse matrix-vector multiplication is therefore crucial and has been the subject...
MPI-3.1 is currently the most recent version of the MPI standard. It adds important extensions to MPI-2, including simplified semantics for the one-sided communication routines and a new tool interface capable of exposing performance data of the MPI implementation to users and libraries. These and other new features make MPI-3 a good candidate for being...
To reach challenging performance goals, computer architectures will change significantly in the near future. Heterogeneous chips, equipped with different types of cores and memory, will compel application developers to deal with irregular communication patterns, high parallelism, and unexpected behaviors. Load balancing among the heterogeneous...
Accelerators such as NVIDIA GPUs and Intel MICs are currently provided as co-processor devices, usable only through a CPU host. For Intel MICs it is planned that this constraint will be lifted in the near future: CPU and accelerator(s) will then form a single many-core processor capable of a peak performance of several teraflops with high energy efficiency.