Abstract

The underwater unmanned vehicle (UUV) is widely used in marine operations, where path planning and trajectory tracking are the critical technologies for autonomous motion planning. In contrast to previous methods, this article proposes asynchronous multithreading proximal policy optimization (AMPPO)-based path planning (AMPPO-PP) and trajectory tracking (AMPPO-TT) algorithms and applies them to different UUV task scenarios. With AMPPO, the expensive online computation is converted to an offline training process. The proposed algorithms enable the UUV to learn autonomous planning, tracking, and emergency obstacle avoidance. The algorithm architectures of AMPPO-PP and AMPPO-TT are described in detail. Reward sparsity is avoided by refining the reward at each timestep and applying reward shaping. A goal-distance heuristic reward function makes the UUV's exploration more goal-directed. Simulation environments ranging from simple to complex are developed, along with multiple comparative experiments, to verify the effectiveness of the proposed algorithms.
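The abstract does not give the form of the goal-distance heuristic reward, so the following is only a minimal sketch of how a dense, per-timestep reward of this kind might look. The function name `goal_distance_reward`, all of its parameters, and the numeric constants are hypothetical and are not taken from the article; the sketch assumes the reward is proportional to the reduction in Euclidean distance to the goal, with fixed terminal terms for collision and arrival.

```python
import math


def goal_distance_reward(prev_pos, pos, goal, collided=False,
                         goal_radius=1.0, progress_gain=1.0,
                         step_penalty=0.01, collision_penalty=10.0,
                         goal_bonus=10.0):
    """Return (reward, done) for one timestep.

    All names and constants here are illustrative assumptions,
    not the reward actually used in the article.
    """
    if collided:
        # Terminal penalty for hitting an obstacle.
        return -collision_penalty, True

    d_prev = math.dist(prev_pos, goal)  # distance to goal before the step
    d_curr = math.dist(pos, goal)       # distance to goal after the step

    if d_curr <= goal_radius:
        # Terminal bonus once the vehicle is within the goal region.
        return goal_bonus, True

    # Dense shaping term: positive when the step moved the vehicle closer
    # to the goal, negative when it moved away. The small per-step penalty
    # discourages wandering.
    reward = progress_gain * (d_prev - d_curr) - step_penalty
    return reward, False


if __name__ == "__main__":
    goal = (10.0, 0.0, 0.0)
    print(goal_distance_reward((0.0, 0.0, 0.0), (1.0, 0.0, 0.0), goal))
    print(goal_distance_reward((1.0, 0.0, 0.0), (0.0, 0.0, 0.0), goal))
```

Because every step produces a nonzero signal tied to progress, the agent receives feedback long before it first reaches the goal, which is the sparsity problem that reward shaping is meant to address.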
