Abstract

In recent years, hierarchical concepts of temporal abstraction have been integrated into the reinforcement learning framework to improve scalability. However, existing approaches are limited to domains where a decomposition into subtasks is known a priori. In this article we propose the concept of explicitly selecting time-scale-related abstract actions when no subgoal-related abstract actions are available. This concept is realised with multi-step actions on different time scales that are combined in a single action set. We exploit the special structure of this action set in the MSA-Q-learning algorithm. The approach is suited to learning optimal policies in “unstructured” domains where a decomposition into subtasks is not known in advance or does not exist at all. By learning on several explicitly specified time scales simultaneously, we achieve a considerable improvement in learning speed, which we demonstrate on several benchmark problems.
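To make the idea concrete, below is a minimal, hypothetical sketch of tabular Q-learning over a combined action set of multi-step actions (MSAs), where each MSA repeats one primitive action for n steps and all time scales share one Q-table. Everything here is an illustrative assumption based on the abstract, not the authors' published implementation: the toy `ChainEnv` environment, the helper names (`select_action`, `execute_msa`, `update`), the chosen time scales {1, 2, 4, 8}, and the n-step backup rule Q(s, msa) += alpha * (R + gamma^n * max_m Q(s', m) - Q(s, msa)).

```python
import random
from collections import defaultdict

# Hypothetical sketch of Q-learning with multi-step actions (MSAs) on
# several explicit time scales; all names and interfaces are illustrative
# assumptions, not the paper's implementation.

GAMMA = 0.95
ALPHA = 0.1
EPSILON = 0.1
PRIMITIVE_ACTIONS = [0, 1]        # hypothetical: 0 = left, 1 = right
TIME_SCALES = [1, 2, 4, 8]        # explicitly specified time scales

# Combined action set: each MSA repeats one primitive action n times.
ACTIONS = [(a, n) for a in PRIMITIVE_ACTIONS for n in TIME_SCALES]

Q = defaultdict(float)            # Q[(state, (a, n))] -> value


class ChainEnv:
    """Tiny deterministic chain, for illustration only: start at 0,
    reach position `length`; small step cost on the way."""

    def __init__(self, length=20):
        self.length = length
        self.pos = 0

    def reset(self):
        self.pos = 0
        return self.pos

    def step(self, a):
        self.pos = min(self.pos + 1, self.length) if a == 1 else max(self.pos - 1, 0)
        done = self.pos == self.length
        return self.pos, (1.0 if done else -0.01), done


def select_action(state):
    """Epsilon-greedy over the combined multi-step action set."""
    if random.random() < EPSILON:
        return random.choice(ACTIONS)
    return max(ACTIONS, key=lambda msa: Q[(state, msa)])


def execute_msa(env, msa):
    """Repeat the primitive action for up to n steps, accumulating the
    discounted return; stop early if the episode terminates."""
    a, n = msa
    ret, discount, steps, done, state = 0.0, 1.0, 0, False, None
    for _ in range(n):
        state, reward, done = env.step(a)
        ret += discount * reward
        discount *= GAMMA
        steps += 1
        if done:
            break
    return state, ret, steps, done


def update(state, msa, ret, steps, next_state, done):
    """Q-learning backup over the n-step transition:
    Q(s,msa) += alpha * (R + gamma^steps * max_m Q(s',m) - Q(s,msa))."""
    target = ret
    if not done:
        target += (GAMMA ** steps) * max(Q[(next_state, m)] for m in ACTIONS)
    Q[(state, msa)] += ALPHA * (target - Q[(state, msa)])


env = ChainEnv()
for episode in range(300):
    state, done = env.reset(), False
    while not done:
        msa = select_action(state)
        next_state, ret, steps, done = execute_msa(env, msa)
        update(state, msa, ret, steps, next_state, done)
        state = next_state
```

Note that executing an n-step action also traverses every shorter prefix of the same primitive action, so the recorded rewards could in principle be reused to update the shorter time scales from the same trajectory; the abstract's reference to exploiting "the special structure of the action set" plausibly points to a reuse of this kind, but the precise mechanism is described in the paper itself.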
