Laércio Lima Pilla

Learn More
Graphics Processing Units (GPUs) offer high computational power but require high scheduling strain to manage parallel processes, which increases the GPU cross section. The results of extensive neutron radiation experiments performed on NVIDIA GPUs confirm this hypothesis. Reducing the application Degree Of Parallelism (DOP) reduces the scheduling strain but(More)
The parallelism in shared-memory systems has increased significantly with the advent and evolution of multicore processors. Current systems include several multicore and multithreaded processors with Non-Uniform Memory Access (NUMA) characteristics. These architectures require the adoption of two strategies for the efficient execution of parallel(More)
We have developed a model called MigBSP that controls processes rescheduling in BSP (Bulk Synchronous Parallel)applications. A BSP application is composed by one or more supersteps, each one containing both computation and communication phases followed by a synchronization barrier. Since the barrier waits for the slowest process, MigBSP’s final idea(More)
Multi-core nodes with Non-Uniform Memory Access (NUMA) are now a common architecture for high performance computing. On such NUMA nodes, the shared memory is physically distributed into memory banks connected by a network. Owing to this, memory access costs may vary depending on the distance between the processing unit and the memory bank. Therefore, a key(More)
In this paper we assess and discuss the radiation sensitivity of a set of HPC applications executed on NVIDIA K20 GPGPUs. The occurrence of both radiation-induced silent data corruption and functional interruption will be experimentally addressed for Hotspot, LavaMD, and Matrix Transponse. Each of the tested codes requires a proper computational power and(More)
Current multi-core machines feature a complex and hierarchical core topology, multiple levels of cache and memory subsystem with NUMA design. Although this design provides high processing power to parallel machines, it comes with the cost of asymmetric memory access latencies. Depending on the parallel application communication patterns, this asymmetry may(More)
Multi-core compute nodes with non-uniform memory access (NUMA) are now a common architecture in the assembly of large-scale parallel machines. On these machines, in addition to the network communication costs, the memory access costs within a compute node are also asymmetric. Ignoring this can lead to an increase in the data movement costs. Therefore, to(More)
The High Performance Computing (HPC) community aimed for many years to increase performance regardless of energy consumption. Until the end of the decade, a next generation of HPC systems is expected to reach sustained performances of the order of exaflops. This requires many times more performance compared to the fastest supercomputers of today. Achieving(More)