Abstract
Despite the impending flattening of Moore's law, the system size, complexity, and length of molecular dynamics (MD) simulations keep on increasing, thanks to effective code parallelization and optimization combined with algorithmic developments. Going forward, exascale computing poses new challenges to the efficient execution and management of MD simulations. The diversity and rapid developments of hardware architectures, software environments, and MD engines make it necessary that users can easily run benchmarks to optimally set up simulations, both with respect to time-to-solution and overall efficiency. To this end, we have developed the software MDBenchmark to streamline the setup, submission, and analysis of simulation benchmarks and scaling studies. The software design is open and as such not restricted to any specific MD engine or job queuing system. To illustrate the necessity and benefits of running benchmarks and the capabilities of MDBenchmark, we measure the performance of a diverse set of 23 MD simulation systems using GROMACS 2018. We compare the scaling of simulations with the number of nodes for central processing unit (CPU)-only and mixed CPU-graphics processing unit (GPU) nodes and study the performance that can be achieved when running multiple simulations on a single node. In all these cases, we optimize the numbers of message passing interface (MPI) ranks and open multi-processing (OpenMP) threads, which is crucial to maximizing performance. Our results demonstrate the importance of benchmarking for finding the optimal system and hardware specific simulation parameters. Running MD simulations with optimized settings leads to a significant performance increase that reduces the monetary, energetic, and environmental costs of MD simulations.
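In practice, a benchmark study with MDBenchmark reduces to a handful of commands. The following sketch is based on the documented MDBenchmark command-line interface; the job name, module name, host template, and node range are placeholders, and the exact flags may differ between versions (see mdbenchmark --help).

    # Generate benchmark jobs for 1-8 nodes, using a GROMACS 2018 module
    # and the submission template for the host "draco" (placeholder names).
    mdbenchmark generate --name protein --module gromacs/2018.8 \
                         --host draco --min-nodes 1 --max-nodes 8
    # Submit all generated benchmark jobs to the queuing system.
    mdbenchmark submit
    # After the jobs have finished, collect the measured performance.
    mdbenchmark analyze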
Highlights
Molecular dynamics (MD) simulations have become an integral part of the molecular life sciences and material sciences
We identify numbers of message passing interface (MPI) ranks and open multi-processing (OpenMP) threads that produce the best performance for a range of system sizes, study the benefits of hyperthreading, and analyze when it is beneficial to use central processing unit (CPU)-only or mixed CPU–graphics processing unit (GPU) nodes and when to run multiple simulations on a single node
Our results show that the dependence of the performance on the number of MPI ranks is different for CPU-only nodes and for mixed CPU–GPU nodes
Summary
Molecular dynamics (MD) simulations have become an integral part of the molecular life sciences and material sciences. Current compute clusters are composed of compute nodes, each containing at least one CPU, an optional GPU, and gigabytes of dedicated random-access memory (RAM). These nodes are connected in a network such that data can be exchanged between nodes and calculations can be performed in parallel on multiple nodes. A single physical core can often perform two computations at the same time, a feature called “hyperthreading.” When enabled, the number of physical cores is virtually doubled, i.e., for each physical core, two “logical cores” are introduced. To use these heterogeneous resources efficiently and run calculations in parallel, two interfaces are widely used: message passing interface (MPI) and open multi-processing (OpenMP).
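As a hypothetical illustration (not taken from the paper), consider a node with 16 physical cores, i.e., 32 logical cores with hyperthreading enabled. In an MPI-enabled GROMACS build, the decomposition can be set explicitly: the command below splits the 32 logical cores into 8 MPI ranks with 4 OpenMP threads each, so that the product of ranks and threads matches the number of logical cores in use.

    # Hypothetical single-node run: 8 MPI ranks x 4 OpenMP threads = 32 logical cores.
    # The input file benchmark.tpr is a placeholder.
    mpirun -np 8 gmx_mpi mdrun -ntomp 4 -s benchmark.tpr

Which decomposition is fastest depends on the simulation system and the hardware, which is exactly what a benchmark scan over rank-thread combinations reveals.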