Abstract
Reinforcement learning has been widely used for unmanned surface vehicle (USV) control tasks. However, the requirement of numerous training samples limits its transferability to new USVs. In this article, we propose a dynamic estimation meta reinforcement learning (DEMRL) approach that enables few-shot learning of the path following control policy. We first present a dynamic estimation method that learns a latent dynamic context feature; the learned context captures the hidden information of USV dynamics from only a few estimation samples. We then propose a meta reinforcement learning based training framework to learn a generalizable path following control policy. Given the prior knowledge encoded in the dynamic context, the well-trained policy can then quickly adapt to the target USV through a rapid adaptation process. To our knowledge, the proposed method is the first attempt to tackle the few-shot learning challenge in training reinforcement learning based USV path following policies. Extensive experiments demonstrate that the proposed method achieves promising path following performance on unseen USVs with very little training data and a small training budget.
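To make the idea of a dynamic context concrete, the sketch below shows one plausible shape of such a pipeline: a small encoder summarizes a handful of (state, action, next state) transitions collected on the target USV into a latent dynamics context, and the control policy conditions on both the current state and that context. This is a minimal illustrative sketch, not the authors' code; all module names, network sizes, and dimensions are assumptions.

```python
# Illustrative sketch only (assumed architecture, not the paper's implementation):
# adapting to a new USV then amounts to collecting a few transitions and
# re-estimating the context z, rather than retraining the whole policy.
import torch
import torch.nn as nn

STATE_DIM, ACTION_DIM, CONTEXT_DIM = 6, 2, 8  # assumed dimensions


class DynamicsContextEncoder(nn.Module):
    """Maps a small batch of (s, a, s') transitions to one latent context vector."""

    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(2 * STATE_DIM + ACTION_DIM, 64), nn.ReLU(),
            nn.Linear(64, CONTEXT_DIM),
        )

    def forward(self, states, actions, next_states):
        per_sample = self.net(torch.cat([states, actions, next_states], dim=-1))
        return per_sample.mean(dim=0)  # aggregate the few estimation samples


class ContextConditionedPolicy(nn.Module):
    """Path following policy conditioned on the current state and the dynamics context."""

    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(STATE_DIM + CONTEXT_DIM, 64), nn.ReLU(),
            nn.Linear(64, ACTION_DIM), nn.Tanh(),  # bounded control commands
        )

    def forward(self, state, context):
        return self.net(torch.cat([state, context], dim=-1))


if __name__ == "__main__":
    encoder, policy = DynamicsContextEncoder(), ContextConditionedPolicy()
    # A few transitions collected on the (unseen) target USV.
    s = torch.randn(5, STATE_DIM)
    a = torch.randn(5, ACTION_DIM)
    s_next = torch.randn(5, STATE_DIM)
    z = encoder(s, a, s_next)                    # estimate the latent dynamics context
    action = policy(torch.randn(STATE_DIM), z)   # act with the context-conditioned policy
    print(action)
```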