# A novel multiple-walk parallel algorithm for the Barnes–Hut treecode on GPUs – towards cost effective, high performance N-body simulation

title={
A novel multiple-walk parallel algorithm for the Barnes–Hut treecode on GPUs – towards
cost effective, high performance N-body simulation
},
author={Tsuyoshi Hamada and Keigo Nitadori and Khaled Benkrid and Yousuke Ohno and Gentaro Morimoto and Tomonari Masada and Yuichiro Shibata and Kiyoshi Oguri and Makoto Taiji},
journal={Computer Science - Research and Development},
year={2009},
volume={24},
pages={21-31}
}
• Published 20 May 2009
• Computer Science
• Computer Science - Research and Development
## Abstract

Recently, general-purpose computation on graphics processing units (GPGPU) has become an increasingly popular field of study as graphics processing units (GPUs) continue to be proposed as high performance and relatively low cost implementation platforms for scientific computing applications. Among these applications figure astrophysical N-body simulations, which form one of the most challenging problems in computational science. However, in most reported studies, a simple $\mathcal…

## 30 Citations

### Barnes-hut treecode on GPU

• Computer Science
2010 IEEE International Conference on Progress in Informatics and Computing
• 2010
A new implementation of the tree algorithm on GPU using CUDA, which obtains more than a 100X speedup when computing forces between bodies and introduces a new method of building the tree that improves performance further.

### 42 TFlops hierarchical N-body simulations on GPUs with applications in both astrophysics and turbulence

• Physics, Computer Science
Proceedings of the Conference on High Performance Computing Networking, Storage and Analysis
• 2009
The present method calculates the O(N log N) treecode and the O(N) fast multipole method (FMM) on GPUs with unprecedented efficiency, and demonstrates its performance on one standard application (a gravitational N-body simulation) and one non-standard application (simulation of turbulence using vortex particles).

### A novel parallel algorithm for near-field computation in N-body problem on GPU

• Computer Science
• 2011
A novel, efficient parallel algorithm for the near-field computation in the N-body problem on the Graphics Processing Unit (GPU) architecture is proposed, based on Newton's third law and the Z-order space-filling curve.

### The algorithm mapping of the near-field computation in N-body problem on GPU

• Computer Science
• 2011
This paper discusses the principle of mapping algorithms efficiently onto the Graphics Processing Unit (GPU) architecture from the aspects of task partition and data access by researching the

### 190 TFlops Astrophysical N-body Simulation on a Cluster of GPUs

• Computer Science, Physics
2010 ACM/IEEE International Conference for High Performance Computing, Networking, Storage and Analysis
• 2010
The results of a hierarchical N-body simulation on DEGIMA, a cluster of PCs with 576 graphics processing units (GPUs) and an InfiniBand interconnect, are presented.

### On the parallelization and performance analysis of Barnes–Hut algorithm using Java parallel platforms

• Computer Science
SN Applied Sciences
• 2020
Multi-core processors provide time-efficient and cost-effective solutions to execute the algorithms for complex physical systems. However, to efficiently exploit the processing capabilities of the

### Parallel time-space processing model based fast N-body simulation on GPUs

• Computer Science, Physics
PMAM '13
• 2013
A novel parallel implementation of N-body gravitational simulation on GPUs is presented; the experimental results show an acceleration of 413 compared with the CPU, and an acceleration of up to 5.5 times compared with other GPU-based methods.

### Directionally unsplit hydrodynamic schemes with hybrid MPI/OpenMP/GPU parallelization in AMR

• Computer Science
Int. J. High Perform. Comput. Appl.
• 2012
A hybrid MPI/OpenMP model is investigated, which enables the full exploitation of the computing power in a heterogeneous CPU/GPU cluster and significantly improves the overall performance.

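Several of the citing works above concern pairwise near-field force computation, and one of them mentions exploiting Newton's third law. As a rough illustration only (not the method of any listed paper), the following CPU sketch visits each pair once and applies equal and opposite contributions. The function name, the `softening` parameter, and the unit choice `G=1.0` are assumptions made for this example.

```python
import math


def pairwise_accelerations(positions, masses, softening=1e-3, G=1.0):
    """Direct-summation gravitational accelerations, visiting each pair once.

    Hypothetical helper for illustration; softening and G are assumed values.
    """
    n = len(positions)
    acc = [[0.0, 0.0, 0.0] for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            # Separation vector pointing from body i to body j.
            d = [positions[j][k] - positions[i][k] for k in range(3)]
            r2 = sum(c * c for c in d) + softening * softening
            inv_r3 = 1.0 / (r2 * math.sqrt(r2))
            for k in range(3):
                f = G * d[k] * inv_r3
                # Newton's third law: the force on j is the negative of
                # the force on i, so one evaluation serves both bodies.
                acc[i][k] += masses[j] * f
                acc[j][k] -= masses[i] * f
    return acc
```

Visiting each pair once halves the number of interaction evaluations relative to a full double loop, at the cost of two writes per pair, which is one reason a GPU mapping of this idea needs care with concurrent updates.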
## References

### GPGPU: general-purpose computation on graphics hardware

• Computer Science
SC
• 2006
The graphics processor (GPU) on today's commodity video cards has evolved into an extremely powerful and flexible processor. Modern graphics architectures provide tremendous memory bandwidth and

### The Chamomile Scheme: An Optimized Algorithm for N-body simulations on Programmable Graphics Processing Units

• Computer Science
• 2007
An algorithm named "Chamomile Scheme" is presented, fully optimized for calculating gravitational interactions on the latest programmable Graphics Processing Unit (GPU), the NVIDIA GeForce 8800 GTX, which has small but fast shared memories and floating-point arithmetic hardware, but only for single precision.

### A hierarchical O(N log N) force-calculation algorithm

• Physics, Computer Science
Nature
• 1986
A novel method of directly calculating the force on N bodies that grows only as N log N is described, using a tree-structured hierarchical subdivision of space into cubic cells, each of which is recursively divided into eight subcells whenever more than one particle is found to occupy the same cell.

### $7.0/Mflops Astrophysical N-Body Simulation with Treecode on GRAPE-5

• Physics
ACM/IEEE SC 1999 Conference (SC'99)
• 1999
As an entry for the 1999 Gordon Bell price/performance prize, we report an astrophysical N-body simulation performed with a treecode on GRAPE-5 (Gravity Pipe 5) system, a special-purpose computer for

### Scan primitives for GPU computing

• Computer Science
GH '07
• 2007
Using the scan primitives, this work shows novel GPU implementations of quicksort and sparse matrix-vector multiply, and analyzes the performance of the scan primitives, several sort algorithms that use the scan primitives, and a graphical shallow-water fluid simulation using the scan framework for a tridiagonal matrix solver.
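
The reference entry for the hierarchical O(N log N) force-calculation algorithm above describes a subdivision of space into cubic cells, each split into eight subcells whenever more than one particle falls in the same cell. The following is a minimal CPU sketch of that subdivision rule only; it is not the GPU algorithm of the listed paper. The `Cell` class, its method names, and `count_particles` are hypothetical, and the sketch assumes all particle positions are distinct and lie inside the root cube.

```python
class Cell:
    """Cubic cell: empty, a leaf holding one particle, or eight subcells."""

    def __init__(self, center, half):
        self.center = center            # (x, y, z) of the cube centre
        self.half = half                # half of the cube edge length
        self.particle = None            # (position, mass) if a leaf
        self.children = None            # list of 8 Cells once split
        self.mass = 0.0                 # total mass inside this cell
        self.msum = [0.0, 0.0, 0.0]     # mass-weighted position sum

    def _child_for(self, pos):
        # Octant index: bit k is set when pos lies on the upper side of axis k.
        idx = sum(1 << k for k in range(3) if pos[k] >= self.center[k])
        return self.children[idx]

    def _split(self):
        h = self.half / 2.0
        self.children = [
            Cell(tuple(self.center[k] + (h if (i >> k) & 1 else -h)
                       for k in range(3)), h)
            for i in range(8)
        ]

    def insert(self, pos, m):
        # Assumes distinct positions; coincident particles would recurse forever.
        if self.children is None and self.particle is None:
            self.particle = (pos, m)    # empty cell becomes a leaf
        else:
            if self.children is None:   # occupied leaf: split, push old one down
                old_pos, old_m = self.particle
                self.particle = None
                self._split()
                self._child_for(old_pos).insert(old_pos, old_m)
            self._child_for(pos).insert(pos, m)
        self.mass += m
        for k in range(3):
            self.msum[k] += m * pos[k]

    def center_of_mass(self):
        return [s / self.mass for s in self.msum]


def count_particles(cell):
    """Count particles stored in the leaves below `cell`."""
    if cell.children is None:
        return 0 if cell.particle is None else 1
    return sum(count_particles(c) for c in cell.children)
```

Each cell keeps its total mass and mass-weighted position sum up to date during insertion, so a tree walk could treat a distant cell as a single point mass at `center_of_mass()`; the acceptance criterion for that approximation is not shown here.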