Abstract

Reinforcement learning is a useful tool for complex control problems that can be neither modeled mathematically nor solved analytically. Direct policy search (DPS) is an approach to reinforcement learning that represents a policy with a parameterized model and searches for an optimal parameter directly using optimization techniques such as genetic algorithms (GA). An instance-based policy is a policy representation model for DPS: it represents a policy as a set of instances, each a pair of a state and an action. In this paper, we present a real-coded GA that efficiently optimizes a set of instances with continuous states and continuous actions for a given episodic task. The proposed method, named FLIP (Functional Learner for Instance-based Policy), was applied to a space robot and a car-like robot. The experimental results show the effectiveness and usefulness of FLIP.
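To make the instance-based policy representation concrete, the following is a minimal sketch, assuming a nearest-neighbor lookup over stored (state, action) pairs; the class name, distance metric, and example values are illustrative assumptions, not details taken from the paper.

```python
import math

class InstanceBasedPolicy:
    """Hypothetical instance-based policy: a set of (state, action) pairs.

    The action for a query state is taken from the stored instance whose
    state is nearest under Euclidean distance (an assumed metric).
    """

    def __init__(self, instances):
        # instances: list of (state, action) pairs, each a tuple of floats
        self.instances = instances

    def act(self, state):
        # Return the action attached to the nearest stored state.
        def dist(inst):
            s, _ = inst
            return math.sqrt(sum((a - b) ** 2 for a, b in zip(s, state)))

        _, action = min(self.instances, key=dist)
        return action


# In a DPS setting, a real-coded GA would evolve the flat vector of
# instance coordinates; here we only query a fixed policy for illustration.
policy = InstanceBasedPolicy([((0.0, 0.0), (1.0,)), ((1.0, 1.0), (-1.0,))])
print(policy.act((0.2, 0.1)))  # nearest instance is ((0.0, 0.0), (1.0,))
```

Under this representation, the GA's chromosome would be the concatenated real-valued coordinates of all instance states and actions, which is what makes a real-coded (rather than binary-coded) GA a natural fit.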
