Speeding up parallel GROMACS on high-latency networks

@article{Kutzner2007SpeedingUP,
  title={Speeding up parallel {GROMACS} on high-latency networks},
  author={Carsten Kutzner and David van der Spoel and Martin Fechner and Erik Lindahl and Udo W. Schmitt and B. L. de Groot and Helmut Grubm{\"u}ller},
  journal={Journal of Computational Chemistry},
  year={2007},
  volume={28},
  number={12},
  pages={2075--2084}
}
We investigate the parallel scaling of the GROMACS molecular dynamics code on Ethernet Beowulf clusters and what prerequisites are necessary for decent scaling even on such clusters with only limited bandwidth and high latency. GROMACS 3.3 scales well on supercomputers like the IBM p690 (Regatta) and on Linux clusters with a special interconnect like Myrinet or Infiniband. Because of the high single-node performance of GROMACS, however, on the widely used Ethernet-switched clusters, the scaling…
Citations

A Method to Accelerate GROMACS in Offload Mode on Tianhe-2 Supercomputer
It is proposed that GROMACS can be run efficiently on CPUs and Intel® Xeon Phi™ Many Integrated Core (MIC) coprocessors at the same time, making full use of Tianhe-2 supercomputer resources.

Porting Molecular Dynamics simulation to heterogeneous multi-core architecture
Besides preserving the integrity of the MD applications and avoiding the additional cost of code modification, experiments porting the MD package Moldy to the IBM CBE architecture show that software ported with the proposed method achieves both a higher speedup ratio and higher performance.

A GPU-Accelerated Fast Multipole Method for GROMACS: Performance and Accuracy
It is found that FMM with a multipole order of 8 yields electrostatic forces that are as accurate as PME with standard parameters, and that for typical mixed-precision simulation settings FMM does not lead to an increased energy drift with multipole orders of 8 or larger.

Performance Analysis Cluster and GPU Computing Environment on Molecular Dynamic Simulation of BRV-1 and REM2 with GROMACS
Evaluating the performance of GROMACS on two different environments, cluster computing resources and GPU-based PCs, shows that runs on the GTX 470 perform best among the tested GPUs as well as the cluster computing resource.

Comparative NAMD benchmarking on BlueGene/P
The experiments show a linear increase in calculation speed with the number of processors, with no saturation observed, whereas the data within the ArmGrid infrastructure show a breakdown in scaling that depends on the communication time between processors.

Gromita: A Fully Integrated Graphical User Interface to Gromacs 4
Gromita is a cross-platform, Perl/Tcl-Tk based, interactive front end designed to break the command-line barrier and introduce a new user-friendly environment for running molecular dynamics simulations through Gromacs.

Provenance Services for Distributed Workflows
This paper presents a service architecture that captures and stores provenance data from distributed, autonomous, replicated, and heterogeneous resources, which can be used to trace the history of the distributed execution process.

Thermally activated charge transport in microbial protein nanowires
The results provide evidence for thermally activated multistep hopping as the mechanism that allows Geobacter pili to function as protein nanowires between the cell and extracellular electron acceptors.

Correlation Spectroscopy and Molecular Dynamics Simulations to Study the Structural Features of Proteins
In this work, we used a combination of fluorescence correlation spectroscopy (FCS) and molecular dynamics (MD) simulation methodologies to acquire structural information on pH-induced unfolding of…

…

References

Showing 1–10 of 45 references
Optimization of Collective Communication Operations in MPICH
The work on improving the performance of collective communication operations in MPICH is described, with results indicating that to achieve the best performance for a collective communication operation, one needs a number of different algorithms and must select the right algorithm for a particular message size and number of processes.
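The size-based selection idea can be sketched as a small dispatcher; this is a hypothetical illustration in Python, and the thresholds and algorithm names are assumptions, not MPICH's actual tuning tables:

```python
# Hypothetical sketch of size-based algorithm selection for a collective
# operation, in the spirit of MPICH's tuned collectives. The thresholds
# and algorithm names below are illustrative, not MPICH's real values.

def choose_allreduce_algorithm(message_bytes: int, num_procs: int) -> str:
    """Pick an allreduce algorithm from message size and process count."""
    if num_procs <= 2:
        return "send-recv"            # trivial case: a single exchange
    if message_bytes <= 2048:
        return "recursive-doubling"   # latency-bound: few, small messages
    if num_procs % 2 == 0:
        return "rabenseifner"         # bandwidth-bound: reduce-scatter + allgather
    return "ring"                     # fallback for odd process counts

print(choose_allreduce_algorithm(1024, 16))     # small message -> recursive-doubling
print(choose_allreduce_algorithm(1 << 20, 16))  # large message -> rabenseifner
```

A real implementation would read such thresholds from measured tuning tables per network and message size rather than hard-coding them.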
Gigabit Ethernet: Technology and Applications for High-Speed LANs
This book guides both users and developers through the complex issues involved in designing and deploying high-speed networks, and helps network technologists and users grounded in the fundamentals of Ethernet understand the workings of the new Gigabit Ethernet system.
An empirical approach for efficient all-to-all personalized communication on Ethernet switched clusters
  • A. Faraj, Xin Yuan
  • 2005 International Conference on Parallel Processing (ICPP'05), 2005
Experimental results show that the empirical approach generates routines that consistently achieve high performance on clusters with different network topologies and that, in many cases, the automatically generated routines outperform conventional AAPC implementations by a large margin.
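One classic AAPC schedule such generated routines can draw on is pairwise exchange; the sketch below (in Python, for a power-of-two process count) is a generic illustration of that schedule, not code from the paper:

```python
# Pairwise-exchange schedule for all-to-all personalized communication
# (AAPC) on p processes, p a power of two: in round r (1 <= r < p),
# process i exchanges its personalized block with partner i XOR r.
# Every pair of processes meets in exactly one round.

def pairwise_schedule(p: int):
    """Return, per round, the list of (i, partner) pairs with i < partner."""
    assert p > 0 and p & (p - 1) == 0, "power-of-two process count assumed"
    rounds = []
    for r in range(1, p):
        rounds.append([(i, i ^ r) for i in range(p) if i < (i ^ r)])
    return rounds

for r, pairs in enumerate(pairwise_schedule(4), start=1):
    print(r, pairs)
# round 1: (0,1) (2,3); round 2: (0,2) (1,3); round 3: (0,3) (1,2)
```

Because each round is a perfect matching, all p/2 exchanges of a round can proceed in parallel without contention on a non-blocking switch, which is one reason topology-aware generators favor such schedules.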
LAM: An Open Cluster Environment for MPI
This paper describes the unique aspects of one implementation of MPI, the standard library definition for message-passing communication, and the goals that justify an MPI design which may at first seem burdensome.
A framework for collective personalized communication
Strategies that reduce the per-message cost to optimize AAPC are presented; the computational overhead of the communication is substantially reduced, at least on machines such as PSC Lemieux, which sport a co-processor capable of remote DMA.
Gigabit ethernet: migrating to high-bandwidth LANs
Gigabit Ethernet Technology: Introduction, Applications, Industry Trends and Technologies, and Scaling Gigabit Ethernet: Looking to the Future.
GROMACS: Fast, flexible, and free
The software suite GROMACS (Groningen MAchine for Chemical Simulation), developed at the University of Groningen, The Netherlands, in the early 1990s, is described; it is a very fast program for molecular dynamics simulation.
Car-Parrinello molecular dynamics on massively parallel computers.
  • J. Hutter, A. Curioni
  • ChemPhysChem: A European Journal of Chemical Physics and Physical Chemistry, 2005
This work presents strategies that have been used to efficiently map the Car-Parrinello algorithm in the CPMD code to two emerging high-performance computing hardware platforms, namely clustered shared-memory parallel servers and ultra-dense massively parallel computers such as the IBM BlueGene/L.
…