Abstract

Improving computing performance and reducing energy consumption are major concerns in heterogeneous many-core systems. The thread count directly influences both the computing performance and the energy consumption of a multithreaded application running on such a system. In this work, we studied the relationship between thread count and application performance in order to improve overall energy efficiency. A prediction model for the optimum thread count, hereafter the thread count prediction model (TCPM), was built using regression analysis based on the program's running behavior and the features of the heterogeneous many-core architecture. Building on this model, a dynamic predictive thread mapping (DPTM) framework was proposed. DPTM uses the prediction model to estimate the optimum thread count and dynamically adjusts the number of active hardware threads according to phase changes in the running program in order to achieve optimal energy efficiency. Experimental results show that DPTM obtains a nearly 49% improvement in performance and a 59% reduction in energy consumption on average. Moreover, DPTM introduces about 2% additional overhead compared with traditional thread mapping for PARSEC (Princeton Application Repository for Shared-Memory Computers) benchmark programs running on an Intel MIC (Many Integrated Core) heterogeneous many-core system.

Highlights

  • With the recent shift towards energy-efficient computing, the heterogeneous many-core system has emerged as a promising solution in the domain of high-performance computing [1].

  • In emerging heterogeneous many-core systems composed of a host processor and a co-processor, the host processor handles complex logical control tasks, while the co-processor computes large-scale parallel tasks with high computing density and simple branching logic.

  • Evaluation results show that dynamic predictive thread mapping (DPTM) improves application performance by 48.6% and decreases energy consumption by 59% on average for PARSEC benchmark programs running on an Intel MIC heterogeneous system, compared with the traditional thread mapping policy.


Summary

Introduction

With the recent shift towards energy-efficient computing, the heterogeneous many-core system has emerged as a promising solution in the domain of high-performance computing [1]. Case IV: The performance speedup changes irregularly with increasing thread count, as shown by canneal and swaptions. These observations clearly indicate the importance of choosing an appropriate number of cores and threads for computing performance and energy efficiency in many-core systems [4]. Iterative searching finds an appropriate thread count by repeatedly testing different thread counts and comparing their performance. This approach has high overhead and cannot reflect the dynamically changing behavior of the application. Evaluation results show that DPTM improves application performance by 48.6% and decreases energy consumption by 59% on average for PARSEC benchmark programs running on an Intel MIC heterogeneous system, compared with the traditional thread mapping policy. DPTM introduces about 2% additional overhead on average to predict and adjust the thread count.
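The general idea of predicting a thread count and revising it only when the program enters a new phase can be illustrated with a minimal Python sketch. It is not the authors' TCPM or DPTM implementation: the `Sample` fields, the relative-change threshold of 0.2, and the coefficients in `predict_threads` are all hypothetical placeholders chosen for illustration, whereas the paper fits its model by regression on measured program behavior and architecture features.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass
class Sample:
    """One sampling interval; both fields are hypothetical stand-ins for hardware counters."""
    ipc: float        # instructions per cycle
    miss_rate: float  # cache misses per instruction


def phase_changed(prev: Sample, cur: Sample, threshold: float = 0.2) -> bool:
    """Flag a phase change when IPC moves by more than `threshold` (relative change).

    The metric and the threshold value are assumptions, not taken from the paper.
    """
    if prev.ipc == 0:
        return cur.ipc != 0
    return abs(cur.ipc - prev.ipc) / prev.ipc > threshold


def predict_threads(s: Sample, max_threads: int) -> int:
    """Toy linear model; the coefficients are invented, not fitted to real data."""
    raw = 0.25 * max_threads + 40.0 * s.ipc - 600.0 * s.miss_rate
    # Clamp the prediction to the range of available hardware threads.
    return max(1, min(max_threads, round(raw)))


def dptm_loop(samples: Iterable[Sample], max_threads: int) -> List[int]:
    """Return the active thread count chosen for each sampling interval."""
    counts: List[int] = []
    prev: Optional[Sample] = None
    current = max_threads
    for s in samples:
        # Re-predict only on the first interval or after a detected phase change.
        if prev is None or phase_changed(prev, s):
            current = predict_threads(s, max_threads)
        counts.append(current)
        prev = s
    return counts
```

In this sketch the prediction is recomputed only at phase boundaries, so steady phases incur no model evaluation beyond the per-interval comparison. That contrasts with iterative searching, which has to run trial configurations before it can settle on a thread count.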

Related Work
Impact Factors on Computing Performance
Notations of Performance Metrics
Theoretical Basis of TCPM
TCPM Establishment
DPTM Mechanism
DPTM Framework
Sampling the Status Information
Detecting the Phase Changes of the Running Program
Threshold Values of the Performance Metric
Detection Algorithm of the Program Phase Changes
DPTM Framework Implementation
Experimental Environment
TCPM Prediction Accuracy Evaluation
DPTM Evaluation
Performance Speedup Evaluation
Cache Miss Evaluation
Energy Consumption Evaluation
Energy–Performance Efficiency Evaluation
Overhead Evaluation
Findings
Conclusions