Parallelization of a treecode

Riccardo Valdarnini

doi:10.1016/s1384-1076(03)00057-5

Abstract

I describe here the performance of a parallel treecode with individual particle timesteps. The code is based on the Barnes–Hut algorithm and runs cosmological N-body simulations on parallel machines with a distributed memory architecture using the MPI message-passing library. For a configuration with a constant number of particles per processor the scalability of the code was tested up to P=128 processors on an IBM SP4 machine. In the large P limit the average CPU time per processor necessary for solving the gravitational interactions is ∼10% higher than that expected from the ideal scaling relation. The processor domains are determined every large timestep according to a recursive orthogonal bisection, using a weighting scheme which takes into account the total particle computational load within the timestep. The results of the numerical tests show that the load balancing efficiency L of the code is high (≳90%) up to P=32, and decreases to L∼80% when P=128. In the latter case it is found that some aspects of the code performance are affected by machine hardware, while the proposed weighting scheme can achieve a load balance as high as L∼90% even in the large P limit.
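
The domain decomposition described above can be illustrated with a minimal sketch of weighted recursive orthogonal bisection. This is not the code used in the paper: the function names `orb_partition` and `load_balance`, the choice of cycling through the coordinate axes at each level, and the definition of the load balancing efficiency as the ratio of mean to maximum domain load are all assumptions made for illustration. The per-particle weights stand in for the computational load accumulated by each particle within the timestep.

```python
def orb_partition(points, weights, n_domains):
    """Split particles into n_domains (a power of two) by weighted
    recursive orthogonal bisection. Returns a list of index lists."""
    if n_domains < 1 or n_domains & (n_domains - 1):
        raise ValueError("n_domains must be a power of two")
    n_dims = len(points[0])

    def split(indices, n, depth):
        if n == 1:
            return [indices]
        # Assumption: cut axes are cycled x, y, z, x, ... with depth.
        axis = depth % n_dims
        ordered = sorted(indices, key=lambda i: points[i][axis])
        half = sum(weights[i] for i in ordered) / 2.0
        running, cut = 0.0, 0
        # Place the cut where the cumulative weight first reaches half.
        for k, i in enumerate(ordered):
            running += weights[i]
            if running >= half:
                cut = k + 1
                break
        # Keep both sides non-empty whenever that is possible.
        cut = max(1, min(cut, len(ordered) - 1))
        return (split(ordered[:cut], n // 2, depth + 1)
                + split(ordered[cut:], n // 2, depth + 1))

    return split(list(range(len(points))), n_domains, 0)


def load_balance(domains, weights):
    """Assumed definition: mean domain load divided by maximum domain load."""
    loads = [sum(weights[i] for i in d) for d in domains]
    return (sum(loads) / len(loads)) / max(loads)
```

With individual particle timesteps, a particle that is advanced more often within a large timestep would carry a proportionally larger weight, so that the bisection equalizes work rather than particle counts.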

Journal: New Astronomy
Publication Date: Apr 16, 2003
Citations: 2

Similar Papers

True Load Balancing for Matricized Tensor Times Khatri-Rao Product
Nabil Abubaker ... Cevdet Aykanat
IEEE Transactions on Parallel and Distributed Systems | VOL. 32
25 Jan 2021

Bounds on Multiprocessing Timing Anomalies
R L Graham
SIAM Journal on Applied Mathematics | VOL. 17
01 Mar 1969

Parallel architectures for processing high speed network signaling protocols
D Ghosal ... T.V Lakshman
IEEE/ACM Transactions on Networking | VOL. 3
01 Jan 1995

Load balancing and locality in hierarchical N-body algorithms on distributed memory architectures
F Baiardi ... M Paoli
01 Jan 1998
