This paper proposes a novel adaptive dynamic programming (ADP) algorithm, named hybrid iteration (HI), to solve the cooperative optimal output regulation problem (CO<inline-formula xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink"><tex-math notation="LaTeX">$^{2}$</tex-math></inline-formula>RP) for continuous-time linear multi-agent systems. Unlike the traditional ADP algorithms, namely policy iteration (PI) and value iteration (VI), HI does not need the initial stabilizing control policy required by PI, yet it converges faster than VI. First, a model-based HI algorithm is proposed to solve the CO<inline-formula xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink"><tex-math notation="LaTeX">$^{2}$</tex-math></inline-formula>RP. Based on this algorithm, a data-driven adaptive optimal controller is then developed to solve the cooperative adaptive optimal output regulation problem without using any information about the physics of the system. Instead, it uses the state and input data collected along the trajectories of the dynamic system. The proposed data-driven HI is applied to the adaptive optimal secondary voltage control (also known as voltage restoration control) of a modern islanded microgrid with inverter-based resources. Comparative simulation results demonstrate that, relative to the VI and PI algorithms, the proposed HI approach significantly reduces the central processing unit (CPU) time needed for convergence, reduces the number of learning iterations, and removes the requirement for an initial stabilizing control policy. Comparative experiments demonstrate the practicality and superiority of the proposed methodology.