Abstract

This paper evaluates the progress made toward performance portability by a subset of ECP applications, or their related mini-apps, spanning a diverse range of application domains and approaches to performance portability. The applications and mini-apps evaluated are AMR-Wind, HACC, SW4, GAMESS RI-MP2, XSBench, and TestSNAP. These codes are being redeveloped using the SYCL, OpenMP, RAJA, or Kokkos programming models, or the AMReX framework, and we assess their performance portability across the AMD MI100, Intel Gen9, and NVIDIA A100 GPUs. Because each GPU has different performance characteristics, we use the roofline performance model to compute each code's performance efficiency on each platform and to evaluate performance portability across the three platforms. We consider the merits of different metrics for quantifying performance portability and propose a metric based on the standard deviation of roofline efficiencies as the preferred one. Finally, we offer observations on developer productivity based on the experience gained from working with these applications.
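To make the idea of a roofline efficiency and its spread across platforms concrete, the following is a minimal sketch and not the paper's actual formulation, which the abstract does not spell out. It assumes the standard roofline bound, min(peak compute, bandwidth × arithmetic intensity), and reports the mean and population standard deviation of the resulting efficiencies. The function names, the platform labels, and all numeric values are hypothetical placeholders, not measurements or hardware specifications.

```python
from statistics import mean, pstdev


def roofline_bound(peak_flops, peak_bandwidth, arithmetic_intensity):
    """Attainable performance under the standard roofline model.

    peak_flops: peak compute rate (e.g. GFLOP/s)
    peak_bandwidth: peak memory bandwidth (e.g. GB/s)
    arithmetic_intensity: FLOPs per byte moved for the kernel
    """
    return min(peak_flops, peak_bandwidth * arithmetic_intensity)


def roofline_efficiency(achieved_flops, peak_flops, peak_bandwidth,
                        arithmetic_intensity):
    """Fraction of the roofline bound that a kernel actually achieves."""
    bound = roofline_bound(peak_flops, peak_bandwidth, arithmetic_intensity)
    return achieved_flops / bound


def efficiency_spread(efficiencies):
    """Mean and population standard deviation of per-platform efficiencies.

    Assumption: a small standard deviation is read as consistent (portable)
    behaviour across platforms; the exact metric proposed in the paper may
    be defined differently.
    """
    values = list(efficiencies)
    return mean(values), pstdev(values)


# Hypothetical placeholder inputs: (achieved, peak compute, bandwidth, intensity).
platforms = {
    "gpu_a": (400.0, 10000.0, 1000.0, 0.5),
    "gpu_b": (30.0, 400.0, 40.0, 1.0),
    "gpu_c": (600.0, 9000.0, 1500.0, 0.5),
}

effs = {name: roofline_efficiency(*args) for name, args in platforms.items()}
avg_eff, std_eff = efficiency_spread(effs.values())
```

In this sketch the mean indicates how close the code runs to the roofline on average, while the standard deviation indicates how evenly that efficiency is distributed across the platforms.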
