Abstract

Many sequential decision-making problems require balancing multiple conflicting objectives, which motivates multi-objective reinforcement learning (MORL). Decision-makers typically want a dense set of solutions that satisfy their requirements while capturing the trade-offs between objectives, i.e., Pareto-optimal solutions. Most deep reinforcement learning methods focus on single-objective problems or address multi-objective problems with simple linear combinations of the objectives, which can oversimplify the underlying problem and lead to suboptimal results. This study proposes a neuroevolutionary diversity policy search approach for MORL. Each individual in the population is a neural network policy equipped with a buffer that stores its recent experiences. During evolution, non-dominated sorting and a diversity distance metric are used to select high-quality solutions as teachers. The teachers guide the population through gradient-based genetic operators to produce high-quality offspring, thereby yielding dense Pareto-optimal solutions. Furthermore, we introduce three MORL benchmarks with distinct characteristics: (1) a continuous deep sea treasure with convex and non-convex Pareto fronts; (2) a multi-objective mountain car with sparse rewards and a discontinuous Pareto front; and (3) a multi-objective HalfCheetah with high-dimensional state-action spaces. Experimental results on the three benchmarks demonstrate the superiority of the proposed algorithm in obtaining dense, high-quality Pareto-optimal solutions.
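
For a concrete picture of the teacher-selection step mentioned above, the sketch below illustrates non-dominated sorting combined with a diversity score over the population's vector-valued returns, in the spirit of NSGA-II. It is a minimal illustration under stated assumptions: the function names and the use of objective-space crowding distance as the diversity measure are choices made for this example, not necessarily the paper's exact diversity distance metric, and the gradient-based genetic operators that the teachers apply to produce offspring are not shown.

```python
import numpy as np


def dominates(a, b):
    """True if objective vector a Pareto-dominates b (maximization)."""
    return bool(np.all(a >= b) and np.any(a > b))


def non_dominated_sort(returns):
    """Split population indices into Pareto fronts; front 0 is non-dominated."""
    n = len(returns)
    dominated_by = [[] for _ in range(n)]   # policies that i dominates
    dom_count = np.zeros(n, dtype=int)      # number of policies dominating i
    fronts = [[]]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            if dominates(returns[i], returns[j]):
                dominated_by[i].append(j)
            elif dominates(returns[j], returns[i]):
                dom_count[i] += 1
        if dom_count[i] == 0:
            fronts[0].append(i)
    k = 0
    while fronts[k]:
        nxt = []
        for i in fronts[k]:
            for j in dominated_by[i]:
                dom_count[j] -= 1
                if dom_count[j] == 0:
                    nxt.append(j)
        fronts.append(nxt)
        k += 1
    return fronts[:-1]  # drop the trailing empty front


def crowding_distance(returns, front):
    """Objective-space diversity score per member (larger = more isolated)."""
    dist = {i: 0.0 for i in front}
    for m in range(returns.shape[1]):
        order = sorted(front, key=lambda i: returns[i, m])
        lo, hi = returns[order[0], m], returns[order[-1], m]
        dist[order[0]] = dist[order[-1]] = np.inf  # keep extreme points
        if hi - lo < 1e-12:
            continue
        for left, mid, right in zip(order, order[1:], order[2:]):
            dist[mid] += (returns[right, m] - returns[left, m]) / (hi - lo)
    return dist


def select_teachers(returns, n_teachers):
    """Pick teachers by Pareto rank, breaking ties with the diversity score."""
    chosen = []
    for front in non_dominated_sort(returns):
        dist = crowding_distance(returns, front)
        for i in sorted(front, key=lambda i: -dist[i]):
            if len(chosen) == n_teachers:
                return chosen
            chosen.append(i)
    return chosen


# Hypothetical usage: two-objective episodic returns for a population of six policies.
population_returns = np.array(
    [[1.0, 8.0], [2.0, 6.0], [3.0, 3.0], [2.5, 2.0], [0.5, 7.0], [4.0, 1.0]]
)
print(select_teachers(population_returns, n_teachers=3))
```

In a selection rule of this form, the Pareto rank pushes the population toward the front, while the diversity score spreads the selected teachers across it, which is what makes a dense approximation of the Pareto set possible.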
