Abstract
Resource management techniques typically optimize the usage of the available resources through runtime or design-time management decisions, such as mapping new application tasks/threads to cores, migrating tasks/threads among cores, scheduling tasks on individual cores, activating/deactivating cores, and changing the Dynamic Voltage and Frequency Scaling (DVFS) levels. Among the existing techniques in the literature, several power budgeting and thermal management techniques are derived or formulated for steady-state temperatures. Nevertheless, management decisions change the power consumption throughout the chip, and this can in turn result in transient temperatures that are much higher than those of any expected steady-state scenario. If this occurs and the transient temperatures exceed the critical threshold temperature, some Dynamic Thermal Management (DTM) technique is activated on the chip to guarantee that it is not damaged. However, very frequent triggers of aggressive DTM techniques may degrade the overall performance of the system in a manner that is unpredictable from the perspective of the resource management techniques. Most importantly, chips could also be seriously damaged if the transient temperatures rise faster than DTM can react to them. In order for the system to operate in thermally safe ranges and behave predictably, resource management techniques could thus benefit from evaluating (i.e., estimating or predicting) such transient temperature peaks when making management decisions. In this chapter, we introduce a lightweight and accurate method for computing the peaks in transient temperatures at runtime. Our technique, called MatEx, is suitable for any compact thermal model that consists of a system of first-order differential equations, e.g., a thermal model based on RC thermal networks (like the one used by HotSpot).
Most existing state-of-the-art techniques and tools for temperature computation, estimation, or prediction use standard numerical methods to solve such a system of first-order differential equations. Although some of these techniques are reasonably efficient, they are not suited to computing only the peaks in temperature during the transient state. These peaks must therefore be extracted from extensive simulations over many time steps, which sometimes take several seconds to compute. In contrast, MatEx is based on an analytical solution using matrix exponentials and linear algebra, which results in a mathematical expression that can be easily analyzed and differentiated in order to compute only the peaks in transient temperatures. Moreover, given that MatEx is based on an exact solution that is a function of time, it can also be used to efficiently compute any future transient temperature without accuracy losses, which makes it a potential replacement for existing temperature estimation tools.
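To make the idea of an analytical, time-dependent solution concrete, the following is a minimal sketch and not the MatEx implementation. It assumes a thermal model of the illustrative form A·dT/dt + B·T = P with constant power P (ambient contributions folded into P), where A holds thermal capacitances and B thermal conductances. The function name, the two-node matrices, and all numeric values are hypothetical.

```python
import numpy as np

def transient_temperature(A, B, P, T_init, t):
    """Closed-form T(t) for A*dT/dt + B*T = P (assumed, illustrative model form)."""
    C = -np.linalg.solve(A, B)          # rewrite as dT/dt = C*T + A^{-1}*P
    T_steady = np.linalg.solve(B, P)    # steady state satisfies B*T_steady = P
    eigvals, V = np.linalg.eig(C)       # C = V * diag(eigvals) * V^{-1}
    # exp(C*t) applied to (T_init - T_steady), using the eigendecomposition
    coeffs = np.linalg.solve(V, T_init - T_steady)
    # For an RC network the eigenvalues are real; .real drops any
    # round-off imaginary part if eig returns a complex dtype.
    return T_steady + (V @ (np.exp(eigvals * t) * coeffs)).real

# Hypothetical two-node example (values chosen only for illustration).
A = np.diag([1.0, 2.0])                   # thermal capacitances
B = np.array([[3.0, -1.0], [-1.0, 2.0]])  # thermal conductances
P = np.array([10.0, 0.0])                 # constant power vector
T0 = np.array([0.0, 0.0])                 # initial temperatures

T_at_1s = transient_temperature(A, B, P, T0, 1.0)
```

Because the result is an explicit function of t, the temperature at any future instant is obtained in one evaluation rather than by stepping through intermediate time steps. Locating peaks would additionally require differentiating this expression with respect to t, which the sketch does not attempt.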