A robust policy bootstrapping algorithm for multi-objective reinforcement learning in non-stationary environments

Sherif Abdelfattah,Jiankun Hu,Kathryn Kasmarik

doi:10.1177/1059712319869313


Open Access

https://doi.org/10.1177/1059712319869313


Journal: Adaptive Behavior
Publication Date: Aug 15, 2019
Citations: 3

Affiliation: University of Canberra

Abstract

Multi-objective Markov decision processes are a special kind of multi-objective optimization problem that involves sequential decision making while satisfying the Markov property of stochastic processes. Multi-objective reinforcement learning methods address this kind of problem by fusing the reinforcement learning paradigm with multi-objective optimization techniques. One major drawback of these methods is the lack of adaptability to non-stationary dynamics in the environment. This is because they adopt optimization procedures that assume stationarity in order to evolve a coverage set of policies that can solve the problem. This article introduces a developmental optimization approach that can evolve the policy coverage set while exploring the preference space over the defined objectives in an online manner. We propose a novel multi-objective reinforcement learning algorithm that can robustly evolve a convex coverage set of policies in an online manner in non-stationary environments. We compare the proposed algorithm with two state-of-the-art multi-objective reinforcement learning algorithms in stationary and non-stationary environments. Results showed that the proposed algorithm significantly outperforms the existing algorithms in non-stationary environments while achieving comparable results in stationary environments.
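To make the term "convex coverage set" concrete, the following is a minimal sketch, not the algorithm proposed in the article. It assumes two objectives, linear scalarization with a preference weight vector, and a small table of hypothetical per-policy value vectors; the function name `approximate_ccs` and all numbers are illustrative assumptions. A policy belongs to the convex coverage set if it maximizes the weighted sum of objectives for at least one preference, which the sketch approximates by sweeping a grid of weights.

```python
import numpy as np


def approximate_ccs(values, n_weights=101):
    """Approximate the convex coverage set for two objectives.

    values: one row per policy, one column per objective (hypothetical data).
    Returns the sorted indices of policies that are optimal under linear
    scalarization for at least one sampled preference weight.
    """
    values = np.asarray(values, dtype=float)
    keep = set()
    # Sweep the preference space: w = (w1, 1 - w1) with w1 in [0, 1].
    for w1 in np.linspace(0.0, 1.0, n_weights):
        w = np.array([w1, 1.0 - w1])
        scalarized = values @ w  # weighted sum of objectives per policy
        keep.add(int(np.argmax(scalarized)))
    return sorted(keep)


# Hypothetical value vectors (objective 1, objective 2) for five policies.
policy_values = [
    [1.0, 0.0],   # 0: best on objective 1
    [0.0, 1.0],   # 1: best on objective 2
    [0.7, 0.7],   # 2: balanced trade-off
    [0.4, 0.55],  # 3: dominated by policy 2
    [0.9, 0.05],  # 4: Pareto-optimal, but never best under any linear weighting
]

ccs = approximate_ccs(policy_values)
print(ccs)
```

Policy 4 shows why a convex coverage set can be smaller than the Pareto front: no other policy dominates it, yet no linear preference makes it the best choice. In a non-stationary environment the value vectors drift over time, so a fixed table like this one would have to be re-estimated online, which is the setting the abstract targets.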


Similar Papers

Learning Pareto Set for Multi-Objective Continuous Robot Control
Tianye Shu ... Hisao Ishibuchi
01 Aug 2024

"Notice of Violation of IEEE Publication Principles" Multiobjective Reinforcement Learning: A Comprehensive Overview.
Chunming Liu ... Dewen Hu
IEEE transactions on cybernetics
29 Apr 2013

Integrating unmanned and manned UAVs data network based on combined Bayesian belief network and multi-objective reinforcement learning algorithm
Richard C Millar ... Armin Mahmoodi
Drone Systems and Applications | VOL. 11
01 Jan 2023

Decomposition Based Multi-Objective Evolutionary Algorithm in XCS for Multi-Objective Reinforcement Learning
Xiu Cheng ... Mengjie Zhang
01 Jul 2018
