Dynamic thermal management (DTM) techniques are being widely used for attenuation of thermal hot spots in many-core systems. Conventionally, DTM techniques are analyzed using simulation and emulation methods, which are in-exhaustive due to their inherent limitations and cannot provide for a comprehensive comparison between DTM techniques owing to the wide range of corresponding design parameters. In order to handle the above discrepancies, we propose to use model checking, a state-space-based formal method, to model, evaluate, and compare DTM techniques across various functional and performance parameters. The suggested framework includes a modeling flow and a set of generic modules that realistically model many-core and DTM parameters like temperature, power, application, intercore communication and task migration, etc. For analysis purpose, the framework provides a common ground for comparing DTM techniques by formalizing DTM principles and performance parameters as a set of logical properties. These properties are verified for different task load configurations, e.g., multithreaded, malleable, and the applications which do not support migration. We analyze state-of-the-art central (c-) and distributed (d-) DTM techniques to demonstrate the generality and efficacy of our approach. Our formal analysis shows that the state-of-the-art cDTM technique performs better than dDTM in terms of achieving thermal stability, task migration, and communication overhead. We believe that conventional analysis methods do not facilitate such an exhaustive comparison among the DTM techniques.
Read full abstract