Abstract

This paper presents a safe off-policy reinforcement learning (RL) scheme for designing optimal controllers for systems with uncertain dynamics. The utility function whose optimization achieves the desired behavior is augmented with a control barrier function (CBF) candidate, providing a platform for merging safety planning and optimal control design. A damping factor is included in the CBF, giving the designer a tool to specify the relative importance of performance and safety. As one of the main contributions of this paper, it is shown that iterative approximation of the value function certifies the safety properties of the CBF, which bridges the broad capability of barrier functions into learning-based approaches. The safety of the control system is then proved accordingly, and its stability and optimality under the safety constraint are verified. Afterward, an off-policy RL algorithm is used to obtain the safe and optimal controller without requiring full knowledge of the system dynamics. The efficiency of the proposed method is demonstrated on lane changing as an automotive control problem.
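To make the abstract's core idea concrete, the following is a minimal, hypothetical sketch (not the paper's exact formulation) of augmenting a quadratic stage utility with a CBF-based barrier term. The safe-set definition, the reciprocal barrier, and the `damping` weight are illustrative assumptions chosen for clarity.

```python
# Hypothetical illustration: a candidate CBF h(x) >= 0 defines the safe set,
# and the stage utility is augmented with a barrier term scaled by a damping
# factor that trades off performance against safety. All function forms here
# are assumptions for illustration, not the paper's exact construction.

def h(x, x_max=1.0):
    """Candidate CBF: positive inside the assumed safe set |x| < x_max."""
    return x_max**2 - x**2

def barrier(x):
    """Reciprocal-style barrier: grows as the state nears the safe-set boundary."""
    hx = h(x)
    if hx <= 0.0:
        return float("inf")  # state is outside the safe set
    return 1.0 / hx

def augmented_utility(x, u, damping=0.1):
    """Quadratic performance utility plus the damped CBF penalty."""
    performance = x**2 + u**2
    return performance + damping * barrier(x)
```

A larger `damping` value penalizes proximity to the safe-set boundary more heavily, so a policy optimizing this augmented utility is pushed toward conservative, safety-preserving behavior; a smaller value prioritizes the original performance objective.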
