Abstract

Timeout policy is an industry standard for dynamic power management (DPM) and is therefore easy and safe to implement in many power-managed systems. Previously, the optimisation of timeout policies suffered from the lack of an effective analytical model and relied on heuristics. This study presents an adaptive optimisation method for timeout DPM policies. First, a semi-Markov control processes model is introduced to formulate the DPM problem of finding timeout policies that minimise power consumption under performance constraints. Under this framework, the equivalence of timeout and stochastic policies with respect to the power-performance tradeoff is revealed, and the equivalent relation between these two types of DPM policy is derived. Then, a reinforcement learning algorithm that combines policy gradient estimation and stochastic approximation is proposed to optimise the timeout policy online. The algorithm does not depend on prior knowledge of system parameters and can achieve a global optimum at low computational cost. Simulation results confirm the analytical results and demonstrate the effectiveness of the proposed algorithm.
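
As a rough illustration of the kind of online optimisation the abstract describes, the following minimal Python sketch tunes a single timeout parameter by stochastic approximation. Everything in it is an assumption rather than the paper's method: the two-state (active/sleep) device model, the exponential idle-period distribution, the cost weights, and the use of a simultaneous-perturbation (SPSA) gradient estimate as a stand-in for the paper's policy-gradient estimator. The Robbins-Monro step sizes shrink over time, so the timeout settles near a minimiser of the expected power-plus-delay cost.

    import random

    # Hypothetical device parameters (assumptions, not taken from the paper):
    P_ACTIVE = 1.0   # power while idling in the active state (W)
    P_SLEEP = 0.1    # power while asleep (W)
    E_WAKE = 0.5     # energy cost of one sleep-to-active transition (J)
    T_WAKE = 0.3     # wake-up delay if a request arrives while asleep (s)
    LAMBDA = 2.0     # Lagrange-style weight on the performance (delay) penalty

    def episode_cost(tau, idle_time):
        """Power/performance cost of one idle period under timeout tau."""
        if idle_time <= tau:
            # The device never sleeps: pay active power for the whole idle period.
            return P_ACTIVE * idle_time
        # Sleep after tau; pay wake-up energy plus a delay penalty on wake-up.
        energy = P_ACTIVE * tau + P_SLEEP * (idle_time - tau) + E_WAKE
        return energy + LAMBDA * T_WAKE

    def spsa_step(tau, k, c0=0.2, a0=0.5):
        """One simultaneous-perturbation gradient step (stochastic approximation)."""
        c = c0 / (k ** 0.25)             # perturbation size, slowly decreasing
        a = a0 / k                       # Robbins-Monro step size
        delta = random.choice((-1.0, 1.0))
        idle = random.expovariate(1.0)   # sample one idle period (unit rate assumed)
        # Central-difference gradient estimate using the same idle sample
        # for both perturbed timeouts (common random numbers).
        g = (episode_cost(tau + c * delta, idle)
             - episode_cost(tau - c * delta, idle)) / (2 * c * delta)
        return max(0.0, tau - a * g)     # keep the timeout non-negative

    tau = 1.0
    for k in range(1, 5001):
        tau = spsa_step(tau, k)
    print(f"learned timeout: {tau:.3f} s")

Like the algorithm summarised above, this loop needs no prior model of the idle-period distribution; it only observes per-episode costs. The model-free flavour is shared with the paper's approach, but the estimator and cost model here are illustrative only.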
