As human exploitation of the ocean expands, the number of unmanned aerial vehicle (UAV) nodes at sea and the demand for their services are both increasing. In dynamic marine environments, traditional resource allocation methods lead to inefficient service transmission and ping-pong effects. This study proposes a novel resource allocation method for UAV networks in marine environments that combines an attention mechanism with the double deep Q-learning (DDQN) algorithm to optimize the service-access strategy and better align network resources with node services. A selective suppression module reduces the variability of action outputs, mitigating the ping-pong effect, and an attention-aware module strengthens node-service compatibility, improving service transmission efficiency. Simulation results indicate that the proposed method completes more services, with a higher total value of completed services, than the DDQN, soft actor–critic (SAC), and deep deterministic policy gradient (DDPG) algorithms.
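For readers unfamiliar with the DDQN baseline named above, the following is a minimal sketch of the standard double Q-learning target only. It is not the authors' implementation and omits the attention-aware and selective suppression modules. The function name, array shapes, and numeric values are illustrative assumptions.

```python
import numpy as np

def double_dqn_targets(rewards, dones, q_online_next, q_target_next, gamma=0.99):
    """Standard double Q-learning target (illustrative; names are hypothetical).

    The online network selects the next action and the target network
    evaluates it, which reduces the overestimation bias of plain Q-learning.
    """
    # Action selection uses the online network's Q-values for the next state.
    best_actions = np.argmax(q_online_next, axis=1)
    # Action evaluation uses the target network's Q-values.
    batch_idx = np.arange(q_target_next.shape[0])
    next_values = q_target_next[batch_idx, best_actions]
    # Terminal transitions (done == 1) contribute no bootstrapped value.
    return rewards + gamma * (1.0 - dones) * next_values

# Toy batch of two transitions with two candidate actions each (made-up numbers).
rewards = np.array([1.0, 0.0])
dones = np.array([0.0, 1.0])
q_online_next = np.array([[0.2, 0.9], [0.5, 0.1]])
q_target_next = np.array([[1.0, 2.0], [3.0, 4.0]])
targets = double_dqn_targets(rewards, dones, q_online_next, q_target_next, gamma=0.5)
```

In a service-access setting, each action would presumably correspond to assigning a service to a candidate node, but that mapping is an assumption here and is not specified in the abstract.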