Balance Reward and Safety Optimization for Safe Reinforcement Learning: A Perspective of Gradient Manipulation

Shangding Gu, Yuhao Ding, Lu Wang, Ming Jin, Bilgehan Sel, Qingwei Lin, Alois Knoll

doi:10.1609/aaai.v38i19.30102

Abstract

Ensuring the safety of Reinforcement Learning (RL) is crucial for its deployment in real-world applications. Nevertheless, managing the trade-off between reward and safety during exploration presents a significant challenge: improving reward performance through policy adjustments may adversely affect safety performance. In this study, we address this conflict by leveraging the theory of gradient manipulation. First, we analyze the conflict between reward and safety gradients. Next, we balance reward and safety optimization by proposing a soft switching policy optimization method, for which we provide convergence analysis. Based on our theoretical examination, we provide a safe RL framework to overcome this challenge, and we develop a Safety-MuJoCo Benchmark to assess the performance of safe RL algorithms. Finally, we evaluate the effectiveness of our method on the Safety-MuJoCo Benchmark and a popular safe RL benchmark, Omnisafe. Experimental results demonstrate that our algorithms outperform several state-of-the-art baselines in terms of balancing reward and safety optimization.
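
To make the idea of conflicting reward and safety gradients concrete, the sketch below shows one generic form of gradient manipulation. It is an illustration only and not the paper's soft switching method: the function names (`cosine`, `combine_gradients`), the projection rule, and the convention that both vectors are "improvement" directions (the safety gradient points toward lower cost) are all assumptions made here. Two gradients are treated as conflicting when their inner product is negative; in that case each is projected onto the plane orthogonal to the other before averaging.

```python
import math


def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def cosine(a, b):
    """Cosine of the angle between a and b; 0.0 if either vector is zero."""
    na, nb = math.sqrt(dot(a, a)), math.sqrt(dot(b, b))
    if na == 0.0 or nb == 0.0:
        return 0.0
    return dot(a, b) / (na * nb)


def combine_gradients(g_reward, g_safety):
    """Hypothetical combination rule (an assumption, not the paper's method).

    Both inputs are improvement directions. If they do not conflict,
    return their plain average. If they conflict (negative inner product),
    remove from each the component that opposes the other, then average.
    """
    d = dot(g_reward, g_safety)
    if d >= 0.0:
        return [(r + s) / 2.0 for r, s in zip(g_reward, g_safety)]

    # d < 0 implies both vectors are non-zero, so the divisions are safe.
    rr = dot(g_reward, g_reward)
    ss = dot(g_safety, g_safety)
    r_proj = [r - (d / ss) * s for r, s in zip(g_reward, g_safety)]
    s_proj = [s - (d / rr) * r for r, s in zip(g_reward, g_safety)]
    return [(a + b) / 2.0 for a, b in zip(r_proj, s_proj)]


if __name__ == "__main__":
    g_r = [1.0, 0.0]   # toy reward-improvement direction
    g_s = [-1.0, 1.0]  # toy safety-improvement direction (conflicts with g_r)
    print(cosine(g_r, g_s) < 0.0)        # True: the gradients conflict
    print(combine_gradients(g_r, g_s))   # [0.25, 0.75]
```

In this toy case the combined direction has a non-negative inner product with both original gradients, so a small step along it does not trade one objective against the other to first order. How the paper itself switches between or weights the two gradients is not specified in the abstract.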

Journal: Proceedings of the AAAI Conference on Artificial Intelligence
Publication Date: Mar 24, 2024
Citations: 1

Similar Papers

A Review of Safe Reinforcement Learning: Methods, Theories, and Applications
Shangding Gu ... Alois Knoll
IEEE Transactions on Pattern Analysis and Machine Intelligence | VOL. 46
01 Dec 2024

Self-Preserving Genetic Algorithms for Safe Learning in Discrete Action Spaces
Preston K Robinette ... Nathaniel P Hamilton
09 May 2023

DEMO: Self-Preserving Genetic Algorithms vs. Safe Reinforcement Learning in Discrete Action Spaces
Preston K Robinette ... Nathaniel P Hamilton
09 May 2023

A comprehensive review on safe reinforcement learning for autonomous vehicle control in dynamic environments
Rohan Inamdar ... Nitish Katal
e-Prime - Advances in Electrical Engineering, Electronics and Energy | VOL. 10
11 Oct 2024
