Abstract

The next decade will see a rapid evolution of HPC node architectures, as power and cooling constraints limit increases in microprocessor clock speeds and constrain data movement. Current and future HPC applications will have to change and adapt as node architectures evolve. Advanced exascale architecture simulators will play a crucial role in the design and optimization of future data-intensive applications. In this paper, we present our simulation-based framework for analyzing the scalability and performance of massive interconnected networks.

Highlights

  • Developers of HPC software must navigate a challenging space of trade-offs with unforeseen effects on delivered application performance

  • Our target applications include: the Gyrokinetic Toroidal Code (GTC) [11,12], an application developed for fusion simulations, including efficient ITER designs; the Global Cloud Resolving Model (GCRM) [13], designed for climate simulations at unprecedented resolutions; and the Materials Science LS3DF [14] computation, which is applied to numerous nanoscience simulations including next-generation solar cell design

  • The application of hardware/software co-design for HPC systems will allow for a bi-directional optimization of design parameters, where software specifications and behavior drive hardware design decisions, and hardware constraints are better understood and accounted for in the implementation of effective application software


Introduction

Developers of HPC software must navigate a challenging space of trade-offs with unforeseen effects on delivered application performance. Understanding performance problems in multiprocessor programs requires detailed information about the synchronization calls that affect the application's scalability and efficiency.

SST/macro supports two execution modes: trace-driven simulation and skeleton model-driven execution. It accepts two trace file formats, both of which record execution information by linking the target application with a library that uses the PMPI [2] interface to intercept MPI calls. While trace-driven simulation can run applications on the largest available machines and analyze the collected traces, skeleton model-driven simulation is required to scale beyond existing hardware and extrapolate application behavior on future exascale machines yet to be designed. The simulation is driven by skeleton applications: simplified models of actual HPC programs that retain enough communication and computation information to mimic the application's behavior. Skeleton application simulation can evaluate efficiency and scalability across widely differing scales, which provides a powerful option for performance prediction of systems that do not yet exist.
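To illustrate the idea behind skeleton model-driven simulation, the sketch below models one timestep of a hypothetical skeleton application as local compute followed by a collective modeled as log2(P) latency/bandwidth message rounds. This is a deliberately minimal stand-in, not SST/macro's actual API or network model (which accounts for topology, routing, and congestion); the latency and bandwidth constants are assumed for illustration only.

```python
import math

# Assumed network parameters for illustration; SST/macro's real models
# are far more detailed (topology, congestion, routing).
LATENCY_S = 2e-6        # per-message latency in seconds (assumed)
BANDWIDTH_BPS = 10e9    # link bandwidth in bytes/second (assumed)

def msg_time(nbytes):
    """Simple latency + size/bandwidth cost for one message."""
    return LATENCY_S + nbytes / BANDWIDTH_BPS

def skeleton_step_time(nranks, nbytes, compute_s):
    """Model one timestep of a skeleton app: local compute followed by
    an allreduce-style collective of ceil(log2(P)) message rounds."""
    rounds = math.ceil(math.log2(nranks))
    return compute_s + rounds * msg_time(nbytes)

# Extrapolate a fixed-work timestep to machine sizes that may not exist:
for p in (1_024, 65_536, 1_048_576):
    t = skeleton_step_time(p, 8 * 1024, 1e-3)
    print(f"{p:>9} ranks: {t * 1e3:.4f} ms/step")
```

Because the skeleton carries only the communication structure and coarse compute costs, evaluating it at a million ranks is as cheap as at a thousand, which is what makes extrapolation to future machines tractable.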

Communication models
Programming models
Compact applications
Findings
Conclusions and Future Work