Abstract

Efficient energy distribution in smart grids is an important problem driven by the need to manage increasing power consumption across the globe. This problem has been studied in the past using different frameworks, including Markov Decision Processes (MDPs) and Reinforcement Learning. However, existing algorithms in Reinforcement Learning theory largely deal with infinite horizon decision making. Smart grid problems, on the other hand, are inherently finite horizon in nature and are therefore best addressed using finite horizon algorithms. We therefore analyze the smart grid setup using a finite horizon MDP model and develop, for the first time, a finite horizon version of the popular Q-learning algorithm. We observe that our algorithm shows good empirical performance in the smart grid setting. We also provide a complete theoretical analysis of the algorithm by proving both its stability and its convergence, with the analysis based entirely on the ordinary differential equation (ODE) method. Beyond smart grids, we additionally demonstrate the performance of our algorithm on a setting of random MDPs, indicating that the algorithm is more generally applicable and can be studied in other settings in the future.
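
To make the stage-wise update concrete, the following is a minimal Python sketch of tabular finite-horizon Q-learning, not the paper's exact algorithm: the environment interface (env.reset and env.step returning the next state and reward) and all hyperparameters are assumptions made for illustration.

    import numpy as np

    def finite_horizon_q_learning(env, n_states, n_actions, H,
                                  episodes=5000, alpha=0.1, eps=0.1, seed=0):
        """Illustrative sketch: one Q-table per stage h = 0..H-1,
        bootstrapping from the next stage's table, with Q at stage H
        fixed at zero (terminal)."""
        rng = np.random.default_rng(seed)
        # Q[h, s, a]: value of action a in state s with H - h stages to go.
        Q = np.zeros((H + 1, n_states, n_actions))  # Q[H] stays zero
        for _ in range(episodes):
            s = env.reset()  # assumed interface: returns an integer state
            for h in range(H):
                # epsilon-greedy exploration at stage h
                if rng.random() < eps:
                    a = int(rng.integers(n_actions))
                else:
                    a = int(np.argmax(Q[h, s]))
                s_next, r = env.step(a)  # assumed: returns (next state, reward)
                # stage-wise temporal-difference update toward stage h + 1
                target = r + np.max(Q[h + 1, s_next])
                Q[h, s, a] += alpha * (target - Q[h, s, a])
                s = s_next
        return Q[:H]

The key difference from standard (infinite horizon) Q-learning is that each stage h maintains its own Q-table and bootstraps from the table at stage h + 1 rather than from itself, so the learned policy is naturally time-dependent.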
