Abstract

Model-free reinforcement learning methods show potential for ship collision avoidance in unknown environments. To address the low learning efficiency of model-free reinforcement learning, a composite learning method is proposed based on the asynchronous advantage actor-critic (A3C) algorithm, a long short-term memory (LSTM) neural network, and Q-learning. The proposed method uses Q-learning to decide adaptively between an LSTM inverse-model-based controller and the model-free A3C policy. Multi-ship collision avoidance simulations are conducted to verify the effectiveness of the model-free A3C method, the proposed inverse-model-based method, and the composite learning method. The simulation results indicate that the proposed composite-learning-based ship collision avoidance method outperforms both the A3C learning method and a traditional optimization-based method.
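
To illustrate the switching layer described above, the sketch below shows a tabular Q-learning agent that selects, at each time step, which of the two controllers issues the command. This is a minimal illustration, not the paper's implementation: the state discretisation, reward, hyperparameters, and all names are assumptions.

    import numpy as np

    # Hypothetical sketch of the composite decision layer: a tabular
    # Q-learning agent chooses, per step, which controller acts.
    # State encoding, reward, and hyperparameters are illustrative
    # assumptions, not the paper's settings.
    N_STATES = 64    # discretised encounter states (assumed encoding)
    N_OPTIONS = 2    # 0 = LSTM inverse-model controller, 1 = A3C policy
    ALPHA, GAMMA, EPSILON = 0.1, 0.95, 0.1

    Q = np.zeros((N_STATES, N_OPTIONS))

    def select_controller(state: int) -> int:
        # Epsilon-greedy choice between the two controllers.
        if np.random.rand() < EPSILON:
            return int(np.random.randint(N_OPTIONS))
        return int(np.argmax(Q[state]))

    def update(state: int, option: int, reward: float, next_state: int) -> None:
        # Standard one-step Q-learning update for the switching policy.
        td_target = reward + GAMMA * np.max(Q[next_state])
        Q[state, option] += ALPHA * (td_target - Q[state, option])

In use, the selected option would dispatch to the corresponding controller (inverse model or A3C policy) for the actual rudder command, and the collision-avoidance reward would drive the update.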
