Abstract

HPC system design and operation are challenged by the critical need for significant advances in efficiency, scalability, user productivity, and performance portability, even as Moore's Law comes to an end with semiconductor technology approaching the nanoscale. Conventional practice employs distributed-memory message-passing programming interfaces, sometimes combined with second-level thread-based interfaces for shared-memory nodes, such as OpenMP, or with means of controlling heterogeneous components, such as OpenCL for GPUs. While these methods include some modest runtime control, they are principally coarse-grained and statically scheduled. Yet many real-world applications achieve efficiencies of less than 10%, although some benchmarks (e.g., HPL) may achieve 80% efficiency or better. To address these challenges, strategies employing runtime software systems are being pursued. These systems exploit information about the status of the application and the system hardware throughout execution, using introspection to guide task scheduling and resource management in support of dynamic adaptive control. Runtime systems provide adaptive means to reduce the effects of starvation, latency, overhead, and contention. While each is unique in its details, many share common properties such as multi-tasking (either preemptive or non-preemptive), message-driven computation such as active messages, sophisticated fine-grain synchronization such as dataflow and futures constructs, global name or address spaces, and control policies for optimizing task scheduling, in part to address the uncertainty of asynchrony. This survey identifies key parameters and properties of modern, and sometimes experimental, runtime systems actively employed today and provides a detailed description, summary, and comparison within a shared space of dimensions. It is not the intent of this paper to determine which is better or worse, but rather to provide sufficient detail to permit the reader to select among them according to individual need.

Highlights

  • A runtime system, or just “runtime”, is a software package that resides between the operating system (OS) and the application programming interface (API) and compiler

  • This paper is a survey of runtime software systems being developed and employed for high performance computing (HPC) to improve the efficiency and scalability of supercomputers, at least for important classes of end-user applications

  • Runtime system software packages are emerging as a complementary way to improve efficiency and scalability, possibly dramatically, at least for important classes of applications and hardware systems


Summary

A Survey


Introduction
Drivers for HPC Runtime Systems
Key Functionality
Multi-Threading
Name Spaces and Addressing
Message-Driven Computing
Synchronization
Major Runtime Exemplars
Legion
PaRSEC
Qthreads
OpenMP and OpenACC
Argobots
XcalableMP
Conclusions and Future Work
Findings
