CEM: Constrained Entropy Maximization for Task-Agnostic Safe Exploration

Qisong Yang, Matthijs T.J. Spaan

doi:10.1609/aaai.v37i9.26281

Abstract

In the absence of assigned tasks, a learning agent typically seeks to explore its environment efficiently. However, the pursuit of exploration brings additional safety risks. An under-explored aspect of reinforcement learning is how to achieve safe and efficient exploration when the task is unknown. In this paper, we propose a practical Constrained Entropy Maximization (CEM) algorithm to solve task-agnostic safe exploration problems, which naturally require a finite horizon and undiscounted constraints on safety costs. The CEM algorithm aims to learn a policy that maximizes state entropy subject to safety constraints. To avoid approximating the state density in complex domains, CEM uses a k-nearest neighbor entropy estimator to evaluate exploration efficiency. For safety, CEM minimizes the safety costs and adaptively trades off safety and exploration based on the current constraint satisfaction. Empirical analysis shows that CEM learns a safe exploration policy in complex environments, which improves both safety and sample efficiency on target tasks.
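To make the k-nearest neighbor idea concrete, the sketch below estimates the entropy of a set of visited states directly from pairwise distances, without fitting a density model. It uses the classic Kozachenko-Leonenko estimator, which is an assumption here: the abstract does not say which k-NN variant CEM uses. The names `knn_entropy` and `digamma_int` are hypothetical and chosen only for illustration.

```python
import math
import random

EULER_GAMMA = 0.5772156649015329


def digamma_int(n):
    """Digamma at a positive integer: psi(n) = -gamma + sum_{j=1}^{n-1} 1/j."""
    return -EULER_GAMMA + sum(1.0 / j for j in range(1, n))


def knn_entropy(states, k=3):
    """Kozachenko-Leonenko k-NN estimate of differential entropy, in nats.

    `states` is a sequence of equal-length coordinate tuples. This is an
    illustrative estimator, not necessarily the exact form used by CEM.
    """
    n = len(states)
    d = len(states[0])
    if not 1 <= k < n:
        raise ValueError("k must satisfy 1 <= k < number of states")
    # Log-volume of the d-dimensional unit ball.
    log_unit_ball = (d / 2.0) * math.log(math.pi) - math.lgamma(d / 2.0 + 1.0)
    log_dist_sum = 0.0
    for i, s in enumerate(states):
        # Brute-force distances to all other states, O(n^2) overall.
        dists = sorted(math.dist(s, t) for j, t in enumerate(states) if j != i)
        # A small floor avoids log(0) when states coincide.
        log_dist_sum += math.log(max(dists[k - 1], 1e-12))
    return digamma_int(n) - digamma_int(k) + log_unit_ball + d * log_dist_sum / n


# Usage: states spread over a larger region yield a higher estimate.
random.seed(0)
narrow = [(random.random(), random.random()) for _ in range(200)]
wide = [(2.0 * x, 2.0 * y) for x, y in narrow]
# Doubling every coordinate shifts the estimate by d * log(2).
print(knn_entropy(wide) - knn_entropy(narrow))
```

Because the estimate depends only on distances to the k-th neighbor, broader state coverage raises it, which is what makes it usable as an exploration signal. The brute-force search is only practical for small batches; a spatial index would be the usual substitute for larger ones.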


Journal: Proceedings of the ... AAAI Conference on Artificial Intelligence
Publication Date: Jun 26, 2023
Citations: 1

