Ship welding is a crucial part of shipbuilding and demands increasingly high levels of robot coordination and working efficiency. To this end, this paper studies the coordinated ship-welding task, in which multiple robots weld multiple weld lines: synchronous weld lines, which must be executed by a pair of robots, and normal weld lines, which can be executed by a single robot. Working efficiency is evaluated by two objectives, the lazy robot ratio and energy consumption, both of which are optimized by the proposed dynamic Kuhn–Munkres-based model-free policy gradient (DKM-MFPG) reinforcement learning algorithm. In DKM-MFPG, a dynamic Kuhn–Munkres (DKM) dispatcher is designed based on weld line and co-welding robot position information obtained from wireless sensors, so that every robot always has a dispatched weld line in real time and the lazy robot ratio is 0. Simultaneously, a model-free policy gradient (MFPG) method based on reinforcement learning is designed to achieve energy-optimal motion control for all robots. The optimality of the lazy robot ratio under the DKM dispatcher and the network convergence of MFPG are analyzed theoretically. Furthermore, the performance of DKM-MFPG is evaluated in simulation under various welding scenario settings and compared with baseline optimization methods. Compared with the four baselines, DKM-MFPG achieves a slight advantage of within 1% in energy consumption and reduces the average lazy robot ratio by 11.30%, 10.99%, 8.27%, and 10.39%, respectively.
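To make the dispatching idea concrete, the following is a minimal sketch of one Kuhn–Munkres (Hungarian) assignment step, not the paper's DKM dispatcher itself. It rests on several assumptions that do not come from the text above: the cost of a robot-to-weld-line pairing is taken to be the Euclidean distance from the robot to the weld line's start point, only normal (single-robot) weld lines are modeled, there are at least as many pending weld lines as robots, and the lazy robot ratio is taken to be the fraction of robots left without a weld line. The names `kuhn_munkres`, `dispatch`, and `lazy_robot_ratio` are hypothetical.

```python
import math


def kuhn_munkres(cost):
    """Minimum-cost assignment for an n x m cost matrix with n <= m.

    Standard O(n^2 * m) Hungarian algorithm with row/column potentials.
    Returns a list where entry i is the column assigned to row i.
    """
    n, m = len(cost), len(cost[0])
    if n > m:
        raise ValueError("sketch assumes rows (robots) <= columns (weld lines)")
    inf = float("inf")
    u = [0.0] * (n + 1)   # row potentials (1-based)
    v = [0.0] * (m + 1)   # column potentials (1-based)
    p = [0] * (m + 1)     # p[j] = row matched to column j, 0 if free
    way = [0] * (m + 1)   # back-pointers along the augmenting path
    for i in range(1, n + 1):
        p[0] = i
        j0 = 0
        minv = [inf] * (m + 1)
        used = [False] * (m + 1)
        while True:
            used[j0] = True
            i0 = p[j0]
            delta, j1 = inf, 0
            for j in range(1, m + 1):
                if not used[j]:
                    cur = cost[i0 - 1][j - 1] - u[i0] - v[j]
                    if cur < minv[j]:
                        minv[j] = cur
                        way[j] = j0
                    if minv[j] < delta:
                        delta, j1 = minv[j], j
            # Shift potentials so that one more reduced cost becomes zero.
            for j in range(m + 1):
                if used[j]:
                    u[p[j]] += delta
                    v[j] -= delta
                else:
                    minv[j] -= delta
            j0 = j1
            if p[j0] == 0:
                break
        # Flip the matching along the augmenting path.
        while j0:
            j1 = way[j0]
            p[j0] = p[j1]
            j0 = j1
    assign = [-1] * n
    for j in range(1, m + 1):
        if p[j]:
            assign[p[j] - 1] = j - 1
    return assign


def dispatch(robot_positions, weld_line_starts):
    """Map each robot index to a weld-line index (assumed distance cost)."""
    cost = [[math.dist(r, w) for w in weld_line_starts]
            for r in robot_positions]
    return dict(enumerate(kuhn_munkres(cost)))


def lazy_robot_ratio(assignment, num_robots):
    """Assumed definition: share of robots with no dispatched weld line."""
    return (num_robots - len(assignment)) / num_robots


robots = [(0.0, 0.0), (10.0, 0.0)]
weld_lines = [(9.0, 1.0), (1.0, 1.0), (5.0, 5.0)]
plan = dispatch(robots, weld_lines)
print(plan, lazy_robot_ratio(plan, len(robots)))
```

In a dynamic setting, a dispatcher of this kind would presumably be re-run whenever a robot finishes a weld line or new position readings arrive; how the paper handles synchronous weld lines and re-dispatch timing is not specified in the abstract, so the sketch leaves both out.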