Abstract
SummaryEnergy‐efficient scientific applications require insight into how high performance computing system features impact the applications' power and performance. This insight can result from the development of performance and power models. In this article, we use the modeling and prediction tool MuMMI (Multiple Metrics Modeling Infrastructure) and 10 machine learning methods to model and predict performance and power consumption and compare their prediction error rates. We use an algorithm‐based fault‐tolerant linear algebra code and a multilevel checkpointing fault‐tolerant heat distribution code to conduct our modeling and prediction study on the Cray XC40 Theta and IBM BG/Q Mira at Argonne National Laboratory and the Intel Haswell cluster Shepard at Sandia National Laboratories. Our experimental results show that the prediction error rates in performance and power using MuMMI are less than 10% for most cases. By utilizing the models for runtime, node power, CPU power, and memory power, we identify the most significant performance counters for potential application optimizations, and we predict theoretical outcomes of the optimizations. Based on two collected datasets, we analyze and compare the prediction accuracy in performance and power consumption using MuMMI and 10 machine learning methods.
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have
More From: Concurrency and Computation: Practice and Experience
Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.