Abstract

• We present a toolchain to transform legacy parallel applications.
• The toolchain supports both C and Fortran applications that use MPI or Global Arrays.
• We employ techniques based on semantic patching and term rewriting.
• We demonstrate improved performance for two case studies.

Performance and scalability optimization of large HPC applications is currently a labor-intensive, manual process with very low productivity. The major difficulties stem from the disaggregated environment for HPC application development: the compiler is involved only in local decisions (at the core or multithreaded-domain level), while a library-based, communication-oriented programming model realizes whole-machine parallelism. Making any major global change in such a disaggregated environment is very difficult and involves changing large portions of the source code. We present semi-automated techniques, based on structural analysis and rewriting, for performing global transformations on the source code of an HPC application. We present two case studies using the Self-Consistent Field (SCF) standalone benchmark as well as the Coupled Cluster (CCSD) module (2.9 million lines of Fortran code), a key module of the NWChem computational chemistry application. We demonstrate how structural rewriting techniques can be used to automate transformations that affect multiple sections of the application’s source code. We show that the transformations can be applied systematically across the source code bases with minimal manual effort. These transformations improve the scalability of the SCF benchmark by more than two orders of magnitude and the performance of the full CCSD module by a factor of four.

