Utilizing Reinforcement Learning to Continuously Improve a Primitive-Based Motion Planner

Zachary C Goddard, Kyle Williams, Anirban Mazumdar, Julie Parish, Kenneth Wardlaw

doi:10.2514/1.i011044

Open Access

Abstract

This paper describes how the performance of motion primitive-based planning algorithms can be improved using reinforcement learning. Specifically, we describe and evaluate a framework that autonomously improves the performance of a primitive-based motion planner. The improvement process consists of three phases: exploration, extraction, and reward updates. This process can be iterated continuously to provide successive improvement. The exploration step generates new trajectories, and the extraction step identifies new primitives from these trajectories. These primitives are then used to update the rewards for continued exploration. The framework required novel shaping rewards, the development of a primitive extraction algorithm, and a modification of the Hybrid A* algorithm. The framework was tested on a navigation task using a nonlinear F-16 model, where it autonomously added 91 motion primitives to the primitive library and reduced average path cost by 21.6 s, or 35.75% of the original cost. The learned primitives were then applied to an obstacle-field navigation task that was not used in training, where they reduced path cost by 16.3 s, or 24.1%. Additionally, two heuristics for the modified Hybrid A* algorithm were designed to improve the effective branching factor.
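
The three-phase loop described in the abstract (exploration, extraction, reward update, repeated) can be outlined as a minimal sketch. This is not the authors' implementation: the function name `improve_library`, its parameters, and the three caller-supplied callables are hypothetical placeholders, and the abstract does not specify how each phase is realized.

```python
from typing import Callable, List, Sequence, TypeVar

# Generic placeholders; the abstract does not define concrete types.
Trajectory = TypeVar("Trajectory")
Primitive = TypeVar("Primitive")
Reward = TypeVar("Reward")


def improve_library(
    library: List[Primitive],
    reward: Reward,
    explore: Callable[[Reward], Sequence[Trajectory]],
    extract: Callable[[Sequence[Trajectory], List[Primitive]], List[Primitive]],
    update_reward: Callable[[Reward, List[Primitive]], Reward],
    iterations: int,
) -> List[Primitive]:
    """Run a fixed number of improvement iterations (hypothetical outline).

    All three callables are assumptions supplied by the caller:
      explore       -- generates new trajectories under the current reward
      extract       -- identifies new primitives from those trajectories
      update_reward -- rebuilds the reward from the enlarged library
    """
    library = list(library)  # copy so the caller's list is left untouched
    for _ in range(iterations):
        trajectories = explore(reward)                   # phase 1: exploration
        new_primitives = extract(trajectories, library)  # phase 2: extraction
        library.extend(new_primitives)
        reward = update_reward(reward, library)          # phase 3: reward update
    return library
```

Passing the phases in as callables keeps the outline independent of any particular learner, extraction rule, or planner, none of which can be inferred from the abstract alone.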