High-performance Physics Simulations Using Multi-core CPUs and GPGPUs in a Volunteer Computing Context

@article{Karimi2010HighperformancePS,
  title={High-performance Physics Simulations Using Multi-core CPUs and GPGPUs in a Volunteer Computing Context},
  author={Kamran Karimi and Neil G. Dickson and Firas Hamze},
  journal={The International Journal of High Performance Computing Applications},
  year={2010},
  volume={25},
  pages={61--69}
}
  • K. Karimi, N. Dickson, F. Hamze
  • Published 31 March 2010
  • Computer Science
  • The International Journal of High Performance Computing Applications
This paper presents two conceptually simple methods for parallelizing a Parallel Tempering Monte Carlo simulation in a distributed volunteer computing context, where computers belonging to the general public are used. The first method uses conventional multi-threading. The second method uses CUDA, a graphics card computing system. Parallel Tempering is described, and challenges such as parallel random number generation and mapping of Monte Carlo chains to different threads are explained. While… 
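
The two challenges named in the abstract, parallel random number generation and the mapping of Monte Carlo chains to threads, can be illustrated with a minimal Python sketch. It is not the authors' implementation: the 1D Ising ring, the function names, the `seed + k` seeding scheme, and the thread pool are all assumptions chosen for brevity. Each chain gets its own generator and its own worker thread for the local moves, and neighbouring temperatures then attempt an exchange. In CPython the global interpreter lock prevents any real speedup here, so the sketch shows the structure only.

```python
import math
import random
from concurrent.futures import ThreadPoolExecutor


def energy(spins, coupling=1.0):
    """Energy of a 1D Ising ring: E = -J * sum_i s_i * s_{i+1}."""
    n = len(spins)
    return -coupling * sum(spins[i] * spins[(i + 1) % n] for i in range(n))


def metropolis_sweep(spins, beta, rng):
    """Run n single-spin Metropolis updates on a copy of `spins` (J = 1)."""
    spins = list(spins)
    n = len(spins)
    for _ in range(n):
        i = rng.randrange(n)
        # Energy change from flipping spin i; spins[i - 1] wraps around the ring.
        delta = 2.0 * spins[i] * (spins[i - 1] + spins[(i + 1) % n])
        if delta <= 0 or rng.random() < math.exp(-beta * delta):
            spins[i] = -spins[i]
    return spins


def parallel_tempering(betas, n_spins=16, n_rounds=50, seed=1234):
    """Hypothetical driver: one chain per inverse temperature in `betas`."""
    # One independent generator per chain (assumed seeding scheme), so chains
    # running on different threads never share random number state.
    rngs = [random.Random(seed + k) for k in range(len(betas))]
    swap_rng = random.Random(seed - 1)
    replicas = [[rngs[k].choice((-1, 1)) for _ in range(n_spins)]
                for k in range(len(betas))]
    with ThreadPoolExecutor(max_workers=len(betas)) as pool:
        for _ in range(n_rounds):
            # Map each chain to its own worker thread for the local moves.
            replicas = list(pool.map(metropolis_sweep, replicas, betas, rngs))
            # Attempt exchanges between neighbouring temperatures.
            for k in range(len(betas) - 1):
                d = (betas[k] - betas[k + 1]) * (
                    energy(replicas[k]) - energy(replicas[k + 1]))
                if d >= 0 or swap_rng.random() < math.exp(d):
                    replicas[k], replicas[k + 1] = replicas[k + 1], replicas[k]
    return replicas
```

Calling `parallel_tempering([0.2, 0.5, 1.0, 2.0])` returns one spin configuration per temperature. Because every chain owns its generator, the result is reproducible for a fixed seed regardless of thread scheduling.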

Importance of explicit vectorization for CPU and GPU software performance

Implementing data parallelisation in a Nested-Sampling Monte Carlo algorithm

This paper reports work on the parallelisation of a Nested Sampling Monte Carlo algorithm used in the nuclear physics field of hadron spectroscopy, using both OpenCL and OpenMP to parallelise the existing code.

Anytime parallel tempering

This work adopts the Anytime Monte Carlo framework for parallel tempering, and shows the methodology for exchanges at real-time deadlines does not introduce a bias and leads to significant performance enhancements over the naïve approach of idling until every processor’s local moves complete.

A Performance Comparison of CUDA and OpenCL

This paper uses complex, near-identical kernels from a Quantum Monte Carlo application to compare the performance of CUDA and OpenCL and shows that when using NVIDIA compiler tools, converting a CUDA kernel to an OpenCL kernel involves minimal modifications.

Solving Batched Linear Programs on GPU and Multicore CPU

The design and CUDA implementation of the batched LP solver library are presented, with coalesced memory access, reduced CPU-GPU memory transfer latency, and load balancing as the goals; its performance is compared against sequential solving on the CPU with the open-source solver GLPK (GNU Linear Programming Kit).

Simultaneous Solving of Batched Linear Programs on a GPU

This paper proposes a batched LP solver in CUDA to accelerate such applications and demonstrates its utility in a use case - state-space exploration of models of control systems design.

Un modèle de transition logico-matérielle pour la simplification de la programmation parallèle. (A software-hardware bridging model for simplifying parallel programming)

This thesis proposes a bridging model named SGL for modelling heterogeneous parallel architectures and parallel algorithms, together with an implementation of parallel skeletons based on the SGL model for high-performance computing, and improves the clarity of algorithm performance analysis.

Grid Computing Technology

The Grid is not only a low level organization for secondary computation, but can also simplify and enable material and knowledge sharing at the higher semantic levels, to support knowledge mixing and distribution.

A high efficient and fast kNN algorithm based on CUDA

This paper implements a CUDA-based kNN algorithm and compares its performance with CPU-only kNN algorithms using single-precision and double-precision data types on classifying celestial objects, demonstrating that CUDA can speed up the kNN algorithm effectively and could be useful in astronomical applications.

References

Quantum Monte Carlo on graphical processing units

Using OpenMP: Portable Shared Memory Parallel Programming (Scientific and Engineering Computation)

Using OpenMP describes how to use OpenMP in full-scale applications to achieve high performance on large-scale architectures, and describes how OpenMP is translated into explicitly multithreaded code, providing a valuable behind-the-scenes account of OpenMP program performance.

Using OpenMP - portable shared memory parallel programming

Using OpenMP describes how to use OpenMP in full-scale applications to achieve high performance on large-scale architectures, and explains how OpenMP is translated into explicitly multithreaded code, providing a valuable behind-the-scenes account of OpenMP program performance.

Quantum Monte Carlo: Origins, Development, Applications

This book takes a similar approach to Henry Schaefer's classic book Quantum Chemistry, collecting summaries of some of the most important papers in the quantum Monte Carlo literature and tying everything together with analysis and discussion of applications.

Robust Parameter Selection for Parallel Tempering

This paper describes an algorithm for selecting parameter values at which to measure equilibrium properties with Parallel Tempering Monte Carlo simulation, starting from an initial set of parameter values that greatly improves equilibration.

Distributed System Design

  • Xiao Chen
  • Computer Science
    Scalable Comput. Pract. Exp.
  • 2000
With a level of expertise simply unmatched in the field, Distributed System Design provides readers with a solid foundation for understanding and further exploring this increasingly important area of technology.

Mersenne twister: a 623-dimensionally equidistributed uniform pseudo-random number generator

A new algorithm called Mersenne Twister (MT) is proposed for generating uniform pseudorandom numbers, which provides a super astronomical period of 2^19937 - 1 and 623-dimensional equidistribution up to 32-bit accuracy, while using a working area of only 624 words.

Exchange Monte Carlo Method and Application to Spin Glass Simulations

We propose an efficient Monte Carlo algorithm for simulating a “hardly-relaxing” system, in which many replicas with different temperatures are simultaneously simulated and a virtual process exchan...
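
For orientation, the exchange step in replica-exchange methods is usually accepted with a Metropolis-type criterion. The form below is the standard textbook expression, not a quotation from the truncated abstract above; the symbols are assumptions: $\beta_i = 1/(k_B T_i)$ is the inverse temperature of replica $i$ and $E_i$ is its current energy.

```latex
P_{\mathrm{swap}}(i \leftrightarrow j)
  = \min\left\{ 1,\ \exp\left[ (\beta_i - \beta_j)\,(E_i - E_j) \right] \right\}
```

With this rule, a swap that moves the lower-energy configuration to the colder replica is always accepted, while the reverse move is accepted only with exponentially small probability.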

Simulation and the Monte Carlo Method (Wiley Series in Probability and Statistics)

This is the authoritative resource for understanding the power behind Monte Carlo methods; a new co-author has been added to enliven the writing style and to provide modern-day expertise on new topics.

Training a Binary Classifier with the Quantum Adiabatic Algorithm

This paper describes how to make the problem of binary classification amenable to quantum computing, and finds that the resulting classifier outperforms a widely used state-of-the-art method, AdaBoost, on a variety of benchmark problems.