Analysis and parallel implementation of a forced N-body problem

C.E. Torres, H. Parishani, O. Ayala, L.F. Rossi, L.-P. Wang

doi:10.1016/j.jcp.2013.03.008

Abstract

Understanding particle dynamics in N-body problems is important to many applications in astrophysics, molecular dynamics, and cloud/plasma physics, where the theoretical representation results in a coupled system of equations for a large number of entities. This paper concerns algorithms for solving a specific N-body problem, namely a system of disturbance velocities for hydrodynamically interacting particles in a particle-laden turbulent flow. The system is derived from the improved superposition method of [1]. Targeting scalable computations on petascale computers, we have carried out a thorough study of a parallel implementation of GMRes with different features, such as preconditioners and matrix-free and parallel sparse representations of the matrix through 1D and 2D spatial domain decompositions. The Gauss–Seidel method is also studied as a reference iterative algorithm. The range of conditions for efficiency and failure of each method is discussed in detail. Through perturbation analysis, we have conducted a series of experiments to understand the effect of particle sizes, interaction symmetry, inter-particle distances, and interaction truncation on the eigenvalues and normality of the linear system. For situations where the system is ill-conditioned, we introduce a restricted Schwarz-type preconditioner. We verified the parallel efficiency of the preconditioner using 1D domain decomposition on a parallel machine. A benchmark problem of particle-laden turbulence at 512³ resolution with 2×10⁶ particles is studied to understand the scalability of the proposed methods on parallel machines. We have developed a stable and highly scalable parallel solver that, through preconditioning, has an affordable computational cost even for ill-conditioned systems. On 64 cores, using GMRes with 2D domain decomposition, we achieved a speed-up of ∼5.6x (relative to 1D domain decomposition on the same number of processors). Our complexity analysis showed that for large N-body problems, the proposed GMRes scheme scales well for a moderate to large number of processors on current tera- to petascale computers.

Full Text