Objectives Naval ship systems such as the hull structure, weapons equipment and power equipment will deteriorate during their service life. Thus, a ship maintenance strategy based on the actual deterioration state is essential for ensuring the safety and availability of naval ships. Methods In this paper, a multi-state deterioration system model is established on the basis of the Markov decision process. A reinforcement learning mode is then introduced to train the agent that generates the maintenance strategy, and the optimal condition-based maintenance strategy is obtained in the process of adaptive learning. Results The proposed method is applied to a ship structural deterioration system for demonstration, and the results show that it can obtain the optimal maintenance policy for a multi-state deterioration system considering the actual conditions, thereby providing an intelligent supporting tool for decision-makers to formulate optimal ship maintenance strategies. Conclusions This paper shows that the reinforcement learning method has great potential in comprehensively improving ship maintenance support.