Priority-Objective Reinforcement Learning

Yusuf Al-Husaini, Matthias Rolf

doi:10.1109/icdl49984.2021.9515661

Abstract

Intelligent agents often have to cope with situations in which their various needs must be prioritised. Efforts have been made, in the fields of cognitive robotics and machine learning, to model need prioritization. Examples of existing frameworks include normative decision theory, the subsumption architecture and reinforcement learning. Reinforcement learning algorithms oriented towards active goal prioritization include the options framework from hierarchical reinforcement learning and the ranking approach as well as the MORE framework from multi-objective reinforcement learning. Previous approaches can be configured to make an agent function optimally in individual environments, but cannot effectively model dynamic and efficient goal selection behaviour in a generalisable framework. Here, we propose an altered version of the MORE framework that includes a threshold constant in order to guide the agent towards making economic decisions in a broad range of ‘priority-objective reinforcement learning’ (PORL) scenarios. The results of our experiments indicate that pre-existing frameworks such as the standard linear scalarization, the ranking approach and the options framework are unable to induce opportunistic objective optimisation in a diverse set of environments. In particular, they display strong dependency on the exact choice of reward values at design time. However, the modified MORE framework appears to deliver adequate performance in all cases tested. From the results of this study, we conclude that employing MORE along with integrated thresholds can effectively simulate opportunistic objective prioritization in a wide variety of contexts.

Full Text