Abstract

Cosmological N‐body simulations are essential for studying the formation of nonlinear structures and for testing hypotheses about phenomena such as dark matter and dark energy. The scale of the problem naturally leads to simulations with billions or even trillions of particles, which demand massive computational power and highly efficient algorithms. In this paper, we present swPHoToNs, a code based on the Particle‐Mesh (PM) method and the Fast Multipole Method (FMM) that performs cosmological simulations with trillions of particles efficiently on the Sunway TaihuLight supercomputer. Our design includes three novel optimizations: (1) a multilevel domain decomposition and dynamic load‐balancing scheme; (2) a pipeline strategy for tree traversal and gravity calculation; and (3) optimizations of both the computation and MPI kernels that take the hardware features into account. We conduct cosmological simulations containing up to 1.6 trillion particles, obtaining a sustained performance of 56.3 PFlops with a weak‐scaling parallel efficiency of 80.9% and a computational efficiency of 44.9%.

