Planning for potential: efficient safe reinforcement learning

Floris Den Hengst, Frank Van Harmelen, Mark Hoogendoorn, Vincent François-Lavet

doi:10.1007/s10994-022-06143-6

https://doi.org/10.1007/s10994-022-06143-6

Journal: Machine Learning
Publication Date: Mar 23, 2022
Citations: 1
License type: open-access

Affiliation: ING Bank, Vrije Universiteit Amsterdam

Abstract

Deep reinforcement learning (DRL) has shown remarkable success in artificial domains and in some real-world applications. However, substantial challenges remain, such as learning efficiently under safety constraints. Adherence to safety constraints is a hard requirement in many high-impact application domains such as healthcare and finance. These constraints are preferably represented symbolically to ensure clear semantics at a suitable level of abstraction. Existing approaches to safe DRL assume that being unsafe leads to low rewards. We show that this is a special case of symbolically constrained RL and analyze a generic setting in which total reward and being safe may or may not be correlated. We analyze the impact of symbolic constraints and identify a connection between expected future reward and distance towards a goal in an automaton representation of the constraints. We use this connection in an algorithm for learning complex behaviors safely and efficiently. This algorithm relies on symbolic reasoning over safety constraints to improve the efficiency of a subsymbolic learner with a symbolically obtained measure of progress. We measure sample efficiency on a grid world and a conversational product recommender with real-world constraints. The so-called Planning for Potential algorithm converges quickly and significantly outperforms all baselines. Specifically, we find that symbolic reasoning is necessary for safety during and after learning and can be effectively used to guide a neural learner towards promising areas of the solution space. We conclude that RL can be applied both safely and efficiently when combined with symbolic reasoning.
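
The link the abstract draws between expected future reward and distance to a goal in an automaton can be illustrated with generic potential-based reward shaping. The sketch below is not the authors' Planning for Potential implementation; it is a minimal illustration under stated assumptions: the constraint automaton is given as a plain dictionary, the potential of an automaton state is the negated number of transitions to the nearest goal state, and the names `distances_to_goal`, `shaping_reward`, and the toy states `q0`, `q1`, `q2` are all hypothetical.

```python
from collections import deque


def distances_to_goal(transitions, goal_states):
    """Return the minimum number of transitions from each state to a goal.

    `transitions` maps (state, symbol) -> next_state (an assumed encoding).
    States that cannot reach any goal are absent from the result.
    """
    # Build reversed edges so we can search backwards from the goals.
    reverse = {}
    for (state, _symbol), nxt in transitions.items():
        reverse.setdefault(nxt, set()).add(state)

    dist = {goal: 0 for goal in goal_states}
    queue = deque(goal_states)
    while queue:
        current = queue.popleft()
        for prev in reverse.get(current, ()):
            if prev not in dist:
                dist[prev] = dist[current] + 1
                queue.append(prev)
    return dist


def shaping_reward(dist, q, q_next, gamma=0.99, unreachable=1e6):
    """Potential-based shaping term gamma * phi(q_next) - phi(q).

    phi(q) is the negated distance to the goal; `unreachable` is an
    assumed penalty distance for states with no path to a goal.
    """
    def phi(state):
        return -float(dist.get(state, unreachable))

    return gamma * phi(q_next) - phi(q)


# Toy automaton (hypothetical): q0 -a-> q1 -b-> q2 (goal), with 'c' resetting to q0.
toy_transitions = {
    ("q0", "a"): "q1",
    ("q1", "b"): "q2",
    ("q0", "c"): "q0",
    ("q1", "c"): "q0",
}
toy_dist = distances_to_goal(toy_transitions, {"q2"})
```

In this toy setup, moving from `q0` to `q1` yields a positive shaping term because the automaton state moves closer to the goal, while falling back from `q1` to `q0` yields a negative one. A learner would add this term to the environment reward; how the paper combines it with safety enforcement is not specified on this page.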

Similar Papers

Deep Interactive Reinforcement Learning for Path Following of Autonomous Underwater Vehicle
Qilei Zhang ... Qixin Sha
IEEE Access | VOL. 8
01 Jan 2020

Sample efficient deep reinforcement learning for control
15 Dec 2019

Sample Efficient Deep Reinforcement Learning With Online State Abstraction and Causal Transformer Model Prediction.
Yixing Lan ... Qiang Fang
IEEE Transactions on Neural Networks and Learning Systems | VOL. PP
01 Jan 2024

Deep Reinforcement Learning
Aske Plaat
01 Jan 2021
