Abstract

Thermal stress, including temporal and spatial temperature gradients as well as thermal cycling, affects the lifetime reliability and performance of modern multiprocessor systems-on-chip (MPSoCs). Conventional power and temperature management techniques that consider peak temperature and power consumption do not provide a comprehensive solution for avoiding high spatial and temporal thermal variations. This work presents TheSPoT, a novel multilevel thermal stress-aware power and thermal management approach for MPSoCs. At the top level, core consolidation and deconsolidation are performed based on peak temperature, thermal stress, and power consumption constraints. The same constraints are used at the next level, where operating frequencies are determined. At this level, we first obtain optimal core frequencies by solving a convex optimization problem. To reduce the runtime overhead in large MPSoCs, we then propose a fast heuristic algorithm as an alternative. The efficacy of the proposed approaches in reducing thermal cycles and temporal/spatial temperature gradients is evaluated by comparing the results with state-of-the-art methods. The evaluation, performed on 4-core, 8-core, and 16-core MPSoCs using PARSEC benchmarks, reveals a considerable reduction in thermal stress. For the 8-core MPSoC case study, the proposed heuristic (optimal) approach improves the mean time to failure by 47% (35%) on average compared to state-of-the-art techniques, with only 6% (4%) performance degradation. Our simulations also show that TheSPoT is more effective at reducing thermal stress when more heterogeneous workloads are used.

