Abstract

We develop a fast reinforcement learning (RL) framework that uses approximated dynamics of a humanoid robot. Although RL is a useful non-linear optimizer, applying it to real robotic systems is usually difficult due to the large number of iterations required to acquire suitable policies. In this study, we approximate the dynamics using data from a real robot with sparse pseudo-input Gaussian processes (SPGPs). With SPGPs, we estimate the predictive distribution while accounting for variance in both the input vectors and the output signals. Since observations from robotic sensors in real environments contain substantial noise, SPGPs are well suited to approximating the stochastic dynamics of a real humanoid robot. We use the approximated dynamics to improve the performance of a movement task in a path integral RL framework, which updates the policy from sampled trajectories of state and action vectors and their associated costs. We implemented our proposed method on a real humanoid robot and tested it on a via-point reaching task. With the proposed method, the robot achieved successful performance with fewer interactions with the real environment than a conventional approach that does not use the simulated dynamics.
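To make the policy-update step described above concrete, the following is a minimal sketch of a path-integral-style, cost-weighted parameter update in Python. It is an illustration, not the authors' implementation: the function `rollout_cost`, the parameter names, and the toy via-point cost are all assumptions, and `rollout_cost` stands in for rolling out the policy through the learned (e.g., SPGP-approximated) dynamics to obtain a trajectory cost.

import numpy as np

def pi2_update(theta, rollout_cost, n_rollouts=20, noise_std=0.1, temperature=10.0):
    """One cost-weighted policy-parameter update in the path-integral style.

    theta        : current policy parameters (1-D array)
    rollout_cost : function mapping perturbed parameters to a scalar
                   trajectory cost, e.g. via rollouts in an approximated
                   dynamics model (hypothetical placeholder here)
    """
    # Sample exploration noise around the current parameters.
    eps = np.random.normal(0.0, noise_std, size=(n_rollouts, theta.size))
    costs = np.array([rollout_cost(theta + e) for e in eps])

    # Normalize costs, then form softmax-style weights so that
    # low-cost rollouts dominate the update.
    s = (costs - costs.min()) / max(costs.max() - costs.min(), 1e-12)
    weights = np.exp(-temperature * s)
    weights /= weights.sum()

    # The cost-weighted average of the perturbations updates the policy.
    return theta + weights @ eps

# Toy usage: drive a 2-D "policy" parameter toward a via point.
via_point = np.array([1.0, -0.5])
cost = lambda th: float(np.sum((th - via_point) ** 2))
theta = np.zeros(2)
for _ in range(50):
    theta = pi2_update(theta, cost)
print(theta)  # converges near the via point

Because the update needs only sampled trajectories and their costs, the expensive rollouts can be generated in the approximated dynamics rather than on the real robot, which is the source of the reduction in real-environment interactions reported above.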
