Abstract

This paper proposes a novel safe reinforcement learning (RL) control algorithm that solves safe optimal control problems for fully cooperative (FC) games of discrete-time multiplayer nonlinear systems with state and input constraints. The FC game is a special case of the nonzero-sum (NZS) game in which all players cooperate to accomplish a common task. The algorithm is built on the policy iteration (PI) framework and uses only measured data collected along the system trajectories. Unlike most PI-based works, an effective method for obtaining initial safe and stable control policies is given here. In addition, control barrier functions (CBFs) and an input constraint function are introduced to augment the reward functions, and the monotonically nonincreasing property of the iterative value function in the PI algorithm keeps the safe set forward invariant. Neural networks are then employed to approximate the system dynamics, the iterative control policies, and the iterative value function, respectively. Theoretical proofs guarantee both the safety and the convergence of the proposed algorithm. Finally, simulation results illustrate the effectiveness and safety of the algorithm.
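To make the idea concrete, the following is a minimal, illustrative sketch of policy iteration with a CBF-augmented stage cost; it is not the paper's algorithm. The scalar dynamics f, the barrier b(x) = 1 - x^2, the grids, and all weights below are assumptions chosen only for demonstration, and a simple tabular scheme stands in for the paper's neural-network approximators.

    # Illustrative sketch only: tabular policy iteration with a CBF-augmented
    # stage cost. The dynamics, barrier, grids, and weights are assumptions.
    import numpy as np

    f = lambda x, u: 0.8 * x + 0.5 * u          # hypothetical scalar dynamics
    barrier = lambda x: 1.0 - x ** 2            # safe set: {x : barrier(x) > 0}

    X = np.linspace(-0.95, 0.95, 191)           # state grid inside the safe set
    U = np.linspace(-1.0, 1.0, 41)              # bounded (constrained) inputs

    def stage_cost(x, u):
        # Quadratic cost plus a recentered reciprocal-barrier term that
        # vanishes at the origin and blows up near the safe-set boundary.
        return x ** 2 + u ** 2 + 0.01 * (1.0 / barrier(x) - 1.0)

    def nearest(x):
        # Index of the grid point closest to x (crude function approximation).
        return int(np.argmin(np.abs(X - x)))

    def q_value(x, u, V):
        xn = f(x, u)
        if barrier(xn) <= 0.0:                  # transition would leave the
            return np.inf                       # safe set: forbid it outright
        return stage_cost(x, u) + V[nearest(xn)]

    V = np.zeros(X.size)
    policy = np.zeros(X.size)                   # u = 0 is an initial policy
                                                # that is safe and stabilizing
                                                # for this particular system

    for _ in range(10):                         # policy iteration
        for _ in range(200):                    # policy evaluation sweeps:
            V = np.array([q_value(x, u, V)      # V(x) = c(x, u) + V(f(x, u))
                          for x, u in zip(X, policy)])
        policy = np.array([U[int(np.argmin([q_value(x, u, V) for u in U]))]
                           for x in X])         # greedy policy improvement

    print("approximate optimal value at x = 0.9:", V[nearest(0.9)])

Any policy with finite value under this cost must keep barrier(x) > 0 along its trajectories, and the PI value iterates never increase, so each improved policy stays safe once the initial one is; this loosely mirrors the forward-invariance argument described in the abstract.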
