Abstract

Energy efficiency and energy-proportional computing have become a central focus in modern supercomputers. These supercomputers should provide high throughput per unit of power to be sustainable in terms of operating cost and failure rates. In this paper, a power-bounded strategy is proposed that maximizes parallel application performance under a given power constraint. The strategy dynamically allocates power to core, uncore, and memory power domains within a node to maximize performance under a given power budget. Experiments on a 20-core Haswell-EP platform for a real-world parallel application GAMESS demonstrate that the proposed strategy delivers performance within 4% of the best possible performance for as much as 25% reduction in the minimum power budget required for maximum performance.

Highlights

  • Power consumption has become a major concern for modern and future supercomputers

  • Experiments on a 20-core Haswell-EP platform for a real-world parallel application GAMESS demonstrate that the proposed strategy delivers performance within 4% of the best possible performance for as much as 25% reduction in the minimum power budget required for maximum performance

  • NAS benchmarks (NPB) [18] and GAMESS were used for evaluating the efficacy of the proposed runtime strategy and to validate the modeling effort as NPB provides a good mix of compute- and memory-intensive benchmarks to test the core, uncore and DRAM power limiting addressed in this work

Read more

Summary

Introduction

Power consumption has become a major concern for modern and future supercomputers. For the current topmost petascale computing platforms in the world, it is typical to consume power on the order of several megawatts as depicted in the biannual TOP 500 list, which may cost on the order of several million dollars. The present paper adds the PP1 (uncore) domain, to the work described in [2], to solve this problem and proposes a power-bounded runtime strategy, which maximizes the parallel application performance under a given power budget. The work presented here may be considered as a combination of [2] and [6] because it determines optimal values for both uncore and core frequencies with the goal to distribute a given power budget to hardware components such that the application performance is maximized. Proposing a runtime power-bounded strategy to maximize parallel application performance under a given power budget by carefully allocating power to PKG, DRAM and uncore domains.

Related Work
Power Allocation Priority
Performance and Power Modeling
Performance Model
Power Model
Runtime Power-Bounded Strategy
Experimental Results
Overview of GAMESS
Experiment Setup
Strategy-Guided Performance under a Power Budget
Minimum Power Budget for GAMESS Calculations
Conclusions
Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call