Abstract
This paper presents a direct heuristic dynamic programming (HDP) method based on Dyna planning (Dyna_HDP) for online model learning in a Markov decision process. This novel technique integrates HDP policy learning into a Dyna agent to speed up learning. We evaluate Dyna_HDP on a differential-drive wheeled mobile robot navigation problem in a 2D maze. Simulations compare Dyna_HDP with traditional reinforcement learning algorithms, namely one-step Q-learning, Sarsa(λ), and Dyna_Q, under the same benchmark conditions. We demonstrate that Dyna_HDP converges to a near-optimal path faster than the other algorithms, with high stability. In addition, we confirm that the Dyna_HDP method can be applied to a multi-robot path-planning problem: a common virtual environment model is learned from the robots' shared experiences, which significantly reduces learning time.
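The Dyna planning component can be illustrated with a minimal tabular Dyna-Q sketch (Dyna_Q is one of the baselines above): the agent updates its value estimates from each real transition, records that transition in a learned model, and then replays simulated transitions from the model to propagate values faster. The 5x5 gridworld, hyperparameters, and function names below are illustrative assumptions, not the paper's implementation or its HDP critic/actor structure.

```python
import random

def dyna_q(episodes=50, planning_steps=10, alpha=0.1, gamma=0.95, epsilon=0.1, seed=0):
    """Tabular Dyna-Q on a toy deterministic 5x5 grid: start (0,0), goal (4,4)."""
    rng = random.Random(seed)
    actions = [(0, 1), (0, -1), (1, 0), (-1, 0)]  # right, left, down, up
    n, goal = 5, (4, 4)

    def step(s, a):
        # deterministic dynamics with wall clamping; reward 1 only at the goal
        s2 = (min(max(s[0] + a[0], 0), n - 1), min(max(s[1] + a[1], 0), n - 1))
        return s2, (1.0 if s2 == goal else 0.0), s2 == goal

    Q = {}      # (state, action) -> value estimate
    model = {}  # (state, action) -> (next_state, reward), learned online

    def update(s, a, r, s2):
        best_next = max(Q.get((s2, b), 0.0) for b in actions)
        Q[(s, a)] = Q.get((s, a), 0.0) + alpha * (r + gamma * best_next - Q.get((s, a), 0.0))

    for _ in range(episodes):
        s, done = (0, 0), False
        while not done:
            # epsilon-greedy action selection
            if rng.random() < epsilon:
                a = rng.choice(actions)
            else:
                a = max(actions, key=lambda b: Q.get((s, b), 0.0))
            s2, r, done = step(s, a)
            update(s, a, r, s2)        # direct RL from real experience
            model[(s, a)] = (s2, r)    # model learning
            # planning: replay previously seen transitions from the model
            for _ in range(planning_steps):
                (ps, pa), (ps2, pr) = rng.choice(list(model.items()))
                update(ps, pa, pr, ps2)
            s = s2
    return Q
```

The planning loop is what distinguishes a Dyna agent from plain Q-learning: each real environment step is amplified by `planning_steps` simulated updates drawn from the learned model, which is the source of the speed-up the abstract claims for Dyna-style methods.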