Iain Bethune

Ever-increasing core counts create the need to develop parallel algorithms that avoid closely-coupled execution across all cores. In this paper we present performance analysis of several parallel asynchronous implementations of Jacobi’s method for solving systems of linear equations, using MPI, SHMEM and OpenMP. In particular we have solved systems of over 4…
CP2K is a widely used application for atomistic simulation that can execute on a range of architectures. Consisting of more than one million lines of Fortran 95 code, the application is tested for correctness with a set of about 2,500 inputs using a dedicated regression testing environment. CP2K can be built with many compilers and executed on different…
CP2K is a freely available and increasingly popular Density Functional Theory code for the simulation of a wide range of systems. It is heavily used on many Cray XT systems, including ‘HECToR’ in the UK and ‘Monte Rosa’ in Switzerland. We describe performance optimisations made to the code in several key areas, including 3D Fourier Transforms, and present…
For many macromolecular systems the accurate sampling of the relevant regions on the potential energy surface cannot be obtained by a single, long Molecular Dynamics (MD) trajectory. New approaches are required to promote more efficient sampling. We present the design and implementation of the Extensible Toolkit for Advanced Sampling and analYsis (ExTASY)…
Iterative methods for solving large sparse systems of linear equations are widely used in many HPC applications. Extreme scaling of these methods can be difficult, however, since global communication to form dot products is typically required at every iteration. To try to overcome this limitation we propose a hybrid approach, where the matrix is partitioned…
This report presents the results of a HECToR dCSE project to improve the performance of CP2K, a freely available and popular Density Functional Theory code, on HECToR. Building on a recently implemented domain decomposition method, further optimisation of the code was performed, and significant performance gains were measured: around 30% on 256 cores (for a…
Ever-increasing core counts create the need to develop parallel algorithms that avoid closely-coupled execution across cores. In this paper we present two case studies investigating the performance of several parallel asynchronous implementations of Jacobi’s method for solving systems of linear equations. Although conditions for the convergence of…
This report describes the work undertaken under PRACE-1IP to support the European scientific communities who make use of CP2K in their research. This was done in two ways: firstly, by improving the performance of the code for a wide range of usage scenarios. The updated code was then tested and installed on the PRACE CURIE supercomputer. We believe this…