Abstract

Conventional reinforcement learning (RL) typically selects a primitive action at each timestep. However, by using a suitable macro action, defined as a sequence of primitive actions, an RL agent can bypass intermediate states to reach a farther state, thereby facilitating its learning procedure. The question we investigate is what beneficial properties macro actions may possess. In this article, we unveil two such properties: reusability and transferability. Reusability means that a macro action derived along with one RL method can be reused by another RL method for training, while transferability indicates that a macro action can be used to train agents in similar environments with different reward settings. In our experiments, we first derive macro actions along with RL methods. We then provide a set of analyses to reveal the reusability and transferability of the derived macro actions.
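
To make the definition of a macro action concrete, the following is a minimal sketch, not the method used in this work. It assumes a hypothetical one-dimensional corridor environment (`Corridor`) and a hypothetical helper (`run_macro`) that executes a fixed sequence of primitive actions as a single decision, accumulating the reward and skipping the intermediate decision points.

```python
from typing import List, Tuple


class Corridor:
    """Hypothetical 1-D corridor: states 0..length-1, goal at the right end."""

    def __init__(self, length: int = 10) -> None:
        self.length = length
        self.state = 0

    def step(self, action: int) -> Tuple[int, float, bool]:
        # Primitive actions (assumed for illustration): 0 = left, 1 = right.
        move = 1 if action == 1 else -1
        self.state = min(max(self.state + move, 0), self.length - 1)
        done = self.state == self.length - 1
        reward = 1.0 if done else 0.0
        return self.state, reward, done


def run_macro(env: Corridor, macro: List[int]) -> Tuple[int, float, bool, int]:
    """Execute a macro action (a sequence of primitive actions) as one decision.

    Returns the final state, the summed reward, the done flag, and the number
    of primitive steps actually taken. Execution stops early if the episode ends.
    """
    total_reward = 0.0
    done = False
    steps = 0
    state = env.state
    for action in macro:
        state, reward, done = env.step(action)
        total_reward += reward
        steps += 1
        if done:
            break
    return state, total_reward, done, steps


# One decision moves the agent three states to the right.
env = Corridor(length=10)
state, reward, done, steps = run_macro(env, [1, 1, 1])
```

In this sketch the agent makes a single choice yet lands three states away, which illustrates how a macro action lets the agent bypass intermediate states. How macro actions are derived, and which RL methods use them, is outside the scope of this example.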
