Evolutionary reinforcement learning system with time-varying parameters

Kosuke Umesako, Masanao Obayashi, Kunikazu Kobayashi

doi:10.1002/eej.20170

Abstract

This paper proposes an evolutionary reinforcement learning system with time-varying parameters that can learn an appropriate policy in dynamical partially observable Markov decision processes (POMDPs). The system's time-varying parameters are adjusted by reinforcement learning, so the system can adapt to temporal variation in a dynamical environment even when that variation cannot be observed. In addition, the state space of the environment is partitioned evolutionarily, so it need not be divided in advance. The efficacy of the proposed system is demonstrated in a mobile robot control simulation in an environment belonging to the class of dynamical POMDPs: a passage with gates that repeatedly open and close.

© 2006 Wiley Periodicals, Inc. Electr Eng Jpn, 156(1): 54–60, 2006; Published online in Wiley InterScience (www.interscience.wiley.com). DOI 10.1002/eej.20170
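To make the kind of environment described in the abstract concrete, the following is a minimal sketch of a passage with a gate that opens and closes over time while the agent observes only its own position. It is not the authors' simulator: the class name `GatedPassage`, the one-dimensional layout, the single gate, the period, and the reward are all assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class GatedPassage:
    """Toy 1-D passage with one periodically opening gate (illustrative only).

    Every number below is an assumption, not a value taken from the paper.
    """
    length: int = 10     # number of cells in the passage (assumed)
    gate_cell: int = 5   # cell occupied by the gate (assumed)
    period: int = 4      # gate is open for `period` steps, then closed for `period` steps (assumed)
    position: int = 0    # the only quantity the agent observes
    t: int = 0           # hidden clock that drives the gate; never shown to the agent

    def gate_open(self) -> bool:
        # Hidden, time-varying part of the environment state.
        return (self.t // self.period) % 2 == 0

    def step(self, action: int):
        """Apply an action (+1 forward, 0 wait, -1 back).

        Returns (observation, reward, done). The observation is the position
        only, so the same observation can hide different gate states.
        """
        target = min(max(self.position + action, 0), self.length - 1)
        # Entering the gate cell is only possible while the gate is open.
        if target == self.gate_cell and not self.gate_open():
            target = self.position
        self.position = target
        self.t += 1
        done = self.position == self.length - 1
        reward = 1.0 if done else 0.0  # assumed reward: 1 on reaching the far end
        return self.position, reward, done


def run_always_forward(env: GatedPassage, max_steps: int = 100) -> int:
    """Run a fixed 'always move forward' policy; return the number of steps taken."""
    for n in range(1, max_steps + 1):
        _, _, done = env.step(+1)
        if done:
            return n
    return max_steps
```

Because the gate phase is hidden, a fixed mapping from position to action cannot tell an open gate from a closed one. In this sketch, that aliasing is what a learner with time-varying parameters would have to compensate for.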
