Abstract

We investigate the performance increase provided by the Intel® Xeon Phi™ coprocessor in multiple-replica molecular dynamics applications using a novel parallelisation scheme. The benefits of the proposed scheme are demonstrated using glycine in water, a system of significant interest in the crystallisation simulation community. The molecular dynamics (MD) engine is built from initially serial LAMMPS and NAMD subroutines, which are subsequently modified and parallelised using a heterogeneous programming model in which each MPI rank is paired with a unique Intel® Xeon Phi™ coprocessor and CPU socket. The MD engine is parallelised using an OpenMP atom domain decomposition algorithm on the Intel® Xeon Phi™ coprocessor and OpenMP task parallelism on the host CPU socket. Using nodes with two Intel® Xeon Phi™ coprocessors, one per socket, we demonstrate that the coprocessor achieves a factor-of-five reduction in the computational resources required per replica, compared with the standard spatial domain decomposition algorithm running without an accelerator. Furthermore, the proposed parallelisation scheme achieves ideal weak scaling with respect to the number of MPI ranks (replicas) employed. The Intel® Xeon Phi™ coprocessor not only allows us to increase the performance output per socket by a factor of five, compared with no accelerator, but also significantly reduces the parallelisation complexity needed to achieve this performance, as the Intel® Xeon Phi™ coprocessor is programmed using the simple OpenMP programming model.

