Abstract

Over the past 20 years, a variety of research activities on high-performance programming have taken place worldwide. General-purpose application programming interfaces (GAPIs) have been designed and standardized; for example, MPI and OpenMP are two successful GAPIs in the field of numerical simulation. However, as the performance of computer systems increases from teraflops to petaflops [1], these GAPIs will not be adequate for the emerging systems of the next 5–10 years. There are two main reasons for this [2]. First, GAPIs need to be significantly improved or redesigned to deal with the multilevel heterogeneous parallelism or deep memory hierarchies mandated by power and microarchitecture considerations. Second, domain-specific abstractions for application programming interfaces (DAPIs) are essential to shield application developers from the complexity and varying availability of such GAPIs.

To date, high-performance programming frameworks (HPPFs) have been used successfully to implement DAPIs in the field of numerical simulation [2]. These frameworks bridge the gap between application developers and GAPIs under the paradigm of ‘think in parallel and write sequentially’: physics and numerical algorithms can be programmed sequentially, based on a flowchart of the parallel code, without any knowledge of GAPIs. Moreover, HPPFs allow parallel code to be ported from petaflop systems to emerging or exascale systems without being rewritten; in other words, HPPFs insulate application software from changes in the underlying computer systems. Typically, HPPFs are in greatest demand in institutes where application-domain programmers outnumber high-performance programmers.

Figure 1 depicts the inner workings of an HPPF. First, domain-specific data structures must be designed to distribute the computational load with minimal data movement across the memory hierarchy, enabling efficient implementation of numerical algorithms. Next, the domain-specific data dependencies of the numerical algorithms must be described accurately in terms of these data structures. Third, parallel computation models (PCMs) can be defined to characterize the different types of numerical computation phases, where each phase represents one type of data dependency among a set of sequential numerical computation entities. Fourth, parallel algorithms for data communication and load balancing must be designed for efficient implementation of the PCMs. Finally, DAPIs are defined as software components [3] using C++ constructs, in which the data structures are made available and the numerical computation entities are integrated. Numerical algorithms can then be implemented sequentially, provided that they can be written as a flowchart comprising numerical computation phases, each corresponding to an existing type of PCM. Further, numerical algorithms can be classified into two types, namely application-aware and application-independent.
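To make the final step concrete, the following is a minimal sketch, in the spirit of the C++ component approach described above, of how a DAPI might let a developer ‘think in parallel and write sequentially’. All names here (Patch, StencilPhase, the kernel signature) are hypothetical illustrations rather than the interface of any particular framework: the developer supplies a sequential kernel for one local patch of data, and the phase object stands in for the framework machinery that would perform data distribution, ghost-cell communication, and load balancing.

    // Hypothetical DAPI sketch: a sequential kernel registered with a
    // parallel computation phase. The framework side would normally use
    // GAPIs (e.g. MPI, OpenMP) behind this interface.
    #include <functional>
    #include <vector>
    #include <cstdio>

    // A local block of the distributed mesh handed to a sequential kernel.
    struct Patch {
        int lo, hi;             // index range owned by this patch
        std::vector<double> u;  // field values, including ghost cells
    };

    // One type of PCM: every patch runs the same sequential kernel after
    // the framework refreshes ghost-cell data.
    class StencilPhase {
    public:
        using Kernel = std::function<void(Patch&)>;
        explicit StencilPhase(Kernel k) : kernel_(std::move(k)) {}

        // In a real framework this would exchange ghost cells and balance
        // load before running the kernel on each locally owned patch; only
        // the sequential contract is shown here.
        void execute(std::vector<Patch>& local_patches) {
            for (Patch& p : local_patches) kernel_(p);
        }
    private:
        Kernel kernel_;
    };

    int main() {
        // The developer 'thinks in parallel' (the phase) but 'writes
        // sequentially' (a plain loop over one patch).
        StencilPhase smooth([](Patch& p) {
            for (int i = p.lo + 1; i < p.hi; ++i)
                p.u[i] = 0.5 * (p.u[i - 1] + p.u[i + 1]);
        });

        std::vector<Patch> patches{{0, 7, std::vector<double>(8, 1.0)}};
        smooth.execute(patches);
        std::printf("u[3] = %f\n", patches[0].u[3]);
        return 0;
    }

In a full framework, execute() would be backed by communication and threading calls at the GAPI level; hiding exactly that detail from the application programmer is the point of the DAPI layer.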
