Abstract

For autonomous robots, we propose an approximate model-based Bayesian reinforcement learning (MB-BRL) approach that reduces the number of real-world samples required while keeping the computational effort feasible. First, to approximate the original undiscounted infinite-horizon MB-BRL problem with cost-free termination, we consider a finite-horizon (FH) MB-BRL problem in which terminal costs are given by robust control policies. The resulting performance is no worse than that of a purely robust method, while the resulting policy may choose exploratory behavior to gather information about parametric model uncertainty and thereby reduce the number of real-world samples. Second, to obtain a feasible solution of the FH MB-BRL problem from simulation samples, we propose a combination of robust RL, Monte Carlo tree search (MCTS), and Bayesian inference. We also present an idea for reusing previous MCTS samples for Bayesian inference at a leaf node. The proposed approach allows the agent to choose among multiple robust policies at a leaf node. Numerical experiments on a two-dimensional peg-in-hole task demonstrate the effectiveness of the proposed approach.
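The abstract outlines three ingredients: a finite-horizon search whose terminal costs come from robust policies, Bayesian inference over uncertain model parameters, and reuse of simulation samples for that inference. The minimal Python sketch below illustrates how these pieces could fit together on a toy one-dimensional system. Everything here is an illustrative assumption rather than the paper's implementation: the dynamics `step`, the particle belief over a single parameter `theta`, the two hand-crafted robust policies, and the shallow exhaustive search standing in for MCTS are all hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# --- Hypothetical toy dynamics with one uncertain parameter theta ---
# x' = x + u + theta + noise; stage cost penalizes distance from the goal x = 0.
def step(x, u, theta, noise_std=0.05):
    return x + u + theta + noise_std * rng.standard_normal()

def stage_cost(x, u):
    return x**2 + 0.1 * u**2

# --- Particle belief over theta, refined by Bayesian inference ---
class Belief:
    def __init__(self, particles):
        self.particles = np.asarray(particles, dtype=float)
        self.weights = np.full(len(self.particles), 1.0 / len(self.particles))

    def update(self, x, u, x_next, noise_std=0.05):
        # Likelihood of the observed transition under each particle (Gaussian noise).
        pred = x + u + self.particles
        lik = np.exp(-0.5 * ((x_next - pred) / noise_std) ** 2)
        w = self.weights * lik
        if w.sum() > 0:
            self.weights = w / w.sum()

    def sample(self):
        return rng.choice(self.particles, p=self.weights)

# --- Two hand-crafted "robust" policies, used only for terminal costs ---
def robust_conservative(x):
    return -0.5 * x

def robust_aggressive(x):
    return -1.0 * x

ROBUST_POLICIES = [robust_conservative, robust_aggressive]

def terminal_cost(x, belief, rollout_len=10, n_rollouts=5):
    # Evaluate each robust policy under the current belief and take the best,
    # so the agent may choose among multiple robust policies at a leaf node.
    costs = []
    for pi in ROBUST_POLICIES:
        total = 0.0
        for _ in range(n_rollouts):
            theta = belief.sample()
            xr = x
            for _ in range(rollout_len):
                u = pi(xr)
                total += stage_cost(xr, u)
                xr = step(xr, u, theta)
        costs.append(total / n_rollouts)
    return min(costs)

# --- Shallow finite-horizon search (a stand-in for MCTS) over belief-augmented states ---
ACTIONS = [-0.4, 0.0, 0.4]

def plan(x, belief, horizon=2, n_samples=8):
    if horizon == 0:
        return terminal_cost(x, belief), None
    best_cost, best_u = np.inf, None
    for u in ACTIONS:
        total = 0.0
        for _ in range(n_samples):
            theta = belief.sample()
            x_next = step(x, u, theta)
            # Reuse the simulated transition to refine the belief along this branch,
            # mirroring the reuse of search samples for Bayesian inference.
            b_next = Belief(belief.particles)
            b_next.weights = belief.weights.copy()
            b_next.update(x, u, x_next)
            c, _ = plan(x_next, b_next, horizon - 1, max(2, n_samples // 2))
            total += stage_cost(x, u) + c
        avg = total / n_samples
        if avg < best_cost:
            best_cost, best_u = avg, u
    return best_cost, best_u

belief = Belief(np.linspace(-0.3, 0.3, 7))  # prior particles over theta
cost, action = plan(x=1.0, belief=belief)
print(f"planned action: {action}, estimated cost: {cost:.3f}")
```

The depth-limited exhaustive search replaces MCTS purely for brevity. The two structural points carried over from the abstract are that each simulated transition also updates the belief along its branch of the search tree, and that a leaf's cost is the best achievable among several robust policies under the belief reached at that leaf.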
