Abstract

This work applies the decomposition principle to discrete-time reinforcement learning to solve optimal control problems for a network of subsystems. The control design is posed as a linear quadratic regulator graphical problem, in which the performance function couples the subsystems' dynamics. We first present a model-free discrete-time reinforcement learning algorithm that learns from online behavior without using the system dynamics. For larger networks, however, this learning process can become prohibitively long. To remedy this issue, we develop an efficient model-free reinforcement learning algorithm based on dynamic mode decomposition. This decomposition reduces the size of the measured data while retaining the dynamic information of the original network, and the resulting algorithm is implemented online. The proposed methodology is validated on examples of a consensus network and a power system network.
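The data-reduction step named in the abstract is dynamic mode decomposition (DMD). As a rough illustration only, the sketch below shows standard exact DMD applied to snapshot matrices X = [x_0, ..., x_{m-1}] and Y = [x_1, ..., x_m] built from measured state trajectories; the paper's online, model-free learning variant is not reproduced here, and the function name `exact_dmd` and truncation rank `r` are illustrative assumptions.

```python
import numpy as np

def exact_dmd(X, Y, r):
    """Standard exact DMD: fit a rank-r linear operator so that Y ~ A X.

    X, Y: n-by-m snapshot matrices, where column k of Y is the state one
    step after column k of X. Returns the reduced operator, its
    eigenvalues, and the DMD modes. (Illustrative sketch only; not the
    paper's specific online algorithm.)
    """
    # Rank-r truncated SVD of the snapshot matrix X
    U, s, Vh = np.linalg.svd(X, full_matrices=False)
    Ur, Sr_inv, Vr = U[:, :r], np.diag(1.0 / s[:r]), Vh[:r].conj().T
    # Project the one-step map onto the leading r singular directions
    A_tilde = Ur.conj().T @ Y @ Vr @ Sr_inv
    # Eigenvalues of A_tilde approximate those of the full operator
    eigvals, W = np.linalg.eig(A_tilde)
    modes = Y @ Vr @ Sr_inv @ W  # exact DMD modes
    return A_tilde, eigvals, modes

# Toy usage: one-step snapshot pairs from a small stable linear system
rng = np.random.default_rng(0)
A = 0.9 * rng.standard_normal((6, 6)) / np.sqrt(6)
X = rng.standard_normal((6, 50))
Y = A @ X
A_tilde, eigvals, modes = exact_dmd(X, Y, r=4)
```

Because A_tilde is r-by-r rather than n-by-n, subsequent learning can operate on the reduced data, which is the kind of size reduction the abstract describes.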
