Abstract

Reinforcement learning is a useful tool for complex control problems that can be neither modeled mathematically nor solved analytically. Direct policy search (DPS) is an approach to reinforcement learning that represents a policy with a parameterized model and searches for an optimal parameter directly using optimization techniques such as genetic algorithms (GA). An instance-based policy is a policy representation model for DPS: it represents a policy as a set of instances, each a pair of a state and an action. In this paper, we present a real-coded GA that efficiently optimizes a set of instances with continuous states and continuous actions for a given episodic task. The proposed method, named FLIP (Functional Learner for Instance-based Policy), was applied to a space robot and a car-like robot. The experimental results show the effectiveness and usefulness of FLIP.
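To make the instance-based policy representation concrete, the following is a minimal sketch, assuming a nearest-neighbor lookup over stored (state, action) pairs; the class name, distance metric, and example values are illustrative assumptions, not details taken from the paper.

```python
import math

class InstanceBasedPolicy:
    """Hypothetical instance-based policy: a set of (state, action) pairs.

    The action for a query state is taken from the stored instance whose
    state is nearest under Euclidean distance (an assumed metric).
    """

    def __init__(self, instances):
        # instances: list of (state, action) pairs, each a tuple of floats
        self.instances = instances

    def act(self, state):
        # Return the action attached to the nearest stored state.
        def dist(inst):
            s, _ = inst
            return math.sqrt(sum((a - b) ** 2 for a, b in zip(s, state)))

        _, action = min(self.instances, key=dist)
        return action


# In a DPS setting, a real-coded GA would evolve the flat vector of
# instance coordinates; here we only query a fixed policy for illustration.
policy = InstanceBasedPolicy([((0.0, 0.0), (1.0,)), ((1.0, 1.0), (-1.0,))])
print(policy.act((0.2, 0.1)))  # nearest instance is ((0.0, 0.0), (1.0,))
```

Under this representation, the GA's chromosome would be the concatenated real-valued coordinates of all instance states and actions, which is what makes a real-coded (rather than binary-coded) GA a natural fit.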
