# High-performance Physics Simulations Using Multi-core CPUs and GPGPUs in a Volunteer Computing Context

@article{Karimi2010HighperformancePS, title={High-performance Physics Simulations Using Multi-core CPUs and GPGPUs in a Volunteer Computing Context}, author={Kamran Karimi and Neil G. Dickson and Firas Hamze}, journal={The International Journal of High Performance Computing Applications}, year={2010}, volume={25}, pages={61 - 69} }

This paper presents two conceptually simple methods for parallelizing a Parallel Tempering Monte Carlo simulation in a distributed volunteer computing context, where computers belonging to the general public are used. The first method uses conventional multi-threading. The second method uses CUDA, a graphics card computing system. Parallel Tempering is described, and challenges such as parallel random number generation and mapping of Monte Carlo chains to different threads are explained. While…
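As a rough illustration of the Parallel Tempering scheme the abstract refers to, the following minimal Python sketch runs several Monte Carlo chains at different inverse temperatures and periodically attempts exchanges between neighbouring chains. It is not the authors' implementation: the double-well `energy` function, the names `swap_probability` and `parallel_tempering`, the step size, and the one-generator-per-chain seeding are all assumptions chosen for illustration. The chains are updated in a plain loop here, whereas the paper maps them to CPU threads or CUDA threads.

```python
import math
import random


def energy(x):
    """Toy double-well energy (hypothetical stand-in for the physics model)."""
    return (x * x - 1.0) ** 2


def swap_probability(beta_i, beta_j, e_i, e_j):
    """Replica-exchange acceptance: min(1, exp((beta_i - beta_j) * (e_i - e_j)))."""
    exponent = (beta_i - beta_j) * (e_i - e_j)
    return 1.0 if exponent >= 0 else math.exp(exponent)


def parallel_tempering(betas, sweeps, seed=0, step=0.5):
    """Run one chain per inverse temperature and return the final states."""
    # One independent generator per chain, standing in for per-thread RNG state.
    rngs = [random.Random(seed + k) for k in range(len(betas))]
    swap_rng = random.Random(seed + len(betas))
    xs = [1.0 for _ in betas]

    for sweep in range(sweeps):
        # Local Metropolis update of every chain; this loop is the part
        # that could be distributed over threads.
        for k, beta in enumerate(betas):
            proposal = xs[k] + rngs[k].uniform(-step, step)
            delta = energy(proposal) - energy(xs[k])
            if delta <= 0 or rngs[k].random() < math.exp(-beta * delta):
                xs[k] = proposal

        # Attempt exchanges between neighbouring temperatures,
        # alternating between even and odd pairs on successive sweeps.
        for k in range(sweep % 2, len(betas) - 1, 2):
            p = swap_probability(betas[k], betas[k + 1],
                                 energy(xs[k]), energy(xs[k + 1]))
            if swap_rng.random() < p:
                xs[k], xs[k + 1] = xs[k + 1], xs[k]

    return xs
```

Calling `parallel_tempering([0.5, 1.0, 2.0], sweeps=1000)` returns one state per temperature. Giving each chain its own seeded generator keeps a run reproducible regardless of the order in which chains are updated, which is one simple way to approach the parallel random number generation issue the abstract mentions.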

## 21 Citations

### Importance of explicit vectorization for CPU and GPU software performance

- Computer Science
- J. Comput. Phys.
- 2011

### Implementing data parallelisation in a Nested-Sampling Monte Carlo algorithm

- Computer Science
- 2013 International Conference on High Performance Computing & Simulation (HPCS)
- 2013

This paper reports work on the parallelisation of a Nested Sampling Monte Carlo algorithm used in the nuclear physics field of hadron spectroscopy, using both OpenCL and OpenMP to parallelise the existing code.

### Anytime parallel tempering

- Computer Science
- Stat. Comput.
- 2021

This work adopts the Anytime Monte Carlo framework for parallel tempering, and shows the methodology for exchanges at real-time deadlines does not introduce a bias and leads to significant performance enhancements over the naïve approach of idling until every processor’s local moves complete.

### Accelerating atomistic calculations of quantum energy eigenstates on graphic cards

- Computer Science
- Comput. Phys. Commun.
- 2014

### A Performance Comparison of CUDA and OpenCL

- Computer Science
- ArXiv
- 2010

This paper uses complex, near-identical kernels from a Quantum Monte Carlo application to compare the performance of CUDA and OpenCL and shows that when using NVIDIA compiler tools, converting a CUDA kernel to an OpenCL kernel involves minimal modifications.

### Solving Batched Linear Programs on GPU and Multicore CPU

- Computer Science
- ArXiv
- 2016

The design and CUDA implementation of the batched LP solver library is presented, with coalesced memory access, reduced CPU-GPU memory transfer latency, and load balancing as the goals, and its performance is compared against sequential solving on the CPU using the open-source solver GLPK (GNU Linear Programming Kit).

### Simultaneous Solving of Batched Linear Programs on a GPU

- Computer Science
- ICPE
- 2019

This paper proposes a batched LP solver in CUDA to accelerate such applications and demonstrates its utility in a use case: state-space exploration of models of control systems design.

### Un modèle de transition logico-matérielle pour la simplification de la programmation parallèle. (A software-hardware bridging model for simplifying parallel programming)

- Computer Science
- 2013

This thesis proposes a bridging model named SGL for modelling heterogeneous parallel architectures and parallel algorithms, along with an implementation of parallel skeletons based on the SGL model for high-performance computing, and improves the clarity of algorithm performance analysis.

### Grid Computing Technology

- Computer Science
- 2014

The Grid is not only a low level organization for secondary computation, but can also simplify and enable material and knowledge sharing at the higher semantic levels, to support knowledge mixing and distribution.

### A high efficient and fast kNN algorithm based on CUDA

- Computer Science
- Astronomical Telescopes + Instrumentation
- 2010

This paper implements a CUDA-based kNN algorithm and compares its performance with CPU-only kNN algorithms using single-precision and double-precision data types on classifying celestial objects, demonstrating that CUDA can speed up the kNN algorithm effectively and could be useful in astronomical applications.

## References

Showing 1-10 of 20 references

### Using OpenMP: Portable Shared Memory Parallel Programming (Scientific and Engineering Computation)

- Computer Science
- 2007

Using OpenMP describes how to use OpenMP in full-scale applications to achieve high performance on large-scale architectures, and describes how OpenMP is translated into explicitly multithreaded code, providing a valuable behind-the-scenes account of OpenMP program performance.

### Using OpenMP - portable shared memory parallel programming

- Computer Science
- Scientific and engineering computation
- 2008

Using OpenMP describes how to use OpenMP in full-scale applications to achieve high performance on large-scale architectures, and explains how OpenMP is translated into explicitly multithreaded code, providing a valuable behind-the-scenes account of OpenMP program performance.

### Quantum Monte Carlo: Origins, Development, Applications

- Physics
- 2007

This book takes a similar approach to Henry Schaefer's classic book Quantum Chemistry, collecting summaries of some of the most important papers in the quantum Monte Carlo literature and tying everything together with analysis and discussion of applications.

### Robust Parameter Selection for Parallel Tempering

- Physics
- 2010

This paper describes an algorithm for selecting parameter values at which to measure equilibrium properties with Parallel Tempering Monte Carlo simulation, starting from an initial set of parameter values that greatly improves equilibration.

### Distributed System Design

- Computer Science
- Scalable Comput. Pract. Exp.
- 2000

With a level of expertise simply unmatched in the field, Distributed System Design provides readers with a solid foundation to understand and further explore this increasingly important area of technology.

### Mersenne twister: a 623-dimensionally equidistributed uniform pseudo-random number generator

- Computer Science, Mathematics
- TOMC
- 1998

A new algorithm called Mersenne Twister (MT) is proposed for generating uniform pseudorandom numbers, which provides a super astronomical period of 2^19937 − 1 and 623-dimensional equidistribution up to 32-bit accuracy, while using a working area of only 624 words.

### Exchange Monte Carlo Method and Application to Spin Glass Simulations

- Physics
- 1996

We propose an efficient Monte Carlo algorithm for simulating a “hardly-relaxing” system, in which many replicas with different temperatures are simultaneously simulated and a virtual process exchan...

### Simulation and the Monte Carlo Method (Wiley Series in Probability and Statistics)

- Computer Science
- 1981

This is the authoritative resource for understanding the power behind Monte Carlo methods; a new co-author has been added to enliven the writing style and to provide modern-day expertise on new topics.

### Training a Binary Classifier with the Quantum Adiabatic Algorithm

- Computer Science
- 2008

This paper describes how to make the problem of binary classification amenable to quantum computing, and finds that the resulting classifier outperforms a widely used state-of-the-art method, AdaBoost, on a variety of benchmark problems.