The space–air–ground integrated network can serve ground users in remote areas by using high-altitude platform (HAP) drones for stable user access and low Earth orbit (LEO) satellites for large-scale traffic backhaul. However, because LEO satellites move rapidly, the matching between LEO satellites and HAP drones must be maintained dynamically. In addition, the different traffic types generated at HAP drones carry different values. Therefore, a tripartite matching problem among LEO satellites, HAP drones, and traffic types is formulated as a mixed-integer nonlinear program that maximizes the average transmitted traffic value while jointly considering multi-dimensional characteristics such as remaining visible time, channel condition, handover latency, and traffic storage capacity. The traffic generation state of each HAP drone is modeled as a mixture of stochastic and deterministic components, which matches real-world scenarios but poses challenges for traditional optimization solvers. Thus, the original problem is decoupled into two independent sub-problems, traffic–drone matching and LEO–drone matching, which are addressed by mathematical simplification and by multi-agent deep reinforcement learning with centralized training and decentralized execution, respectively. Simulation results verify the effectiveness and superiority of the proposed tripartite matching approach.
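To make the decoupling concrete, the following is a minimal toy sketch in Python and not the method of this paper. Every name, number, and scoring rule in it is an assumption introduced for illustration: each drone first picks the stored traffic type with the largest total value (a stand-in for the traffic–drone sub-problem), and an exhaustive search over one-to-one assignments then stands in for the learned LEO–drone matching. The link score assumes that the usable transmission window is the remaining visible time minus a fixed handover latency, and that the amount sent is capped by the stored traffic.

```python
from itertools import permutations

def transmitted_value(value_per_mb, stored_mb, rate_mb_s, visible_s, handover_s):
    """Toy link score: value of the traffic that fits in the usable window."""
    usable_s = max(0.0, visible_s - handover_s)    # handover eats into visible time
    sent_mb = min(stored_mb, rate_mb_s * usable_s) # cannot send more than is stored
    return value_per_mb * sent_mb

def match_traffic(drones):
    """Sub-problem 1 (toy): each drone picks its traffic type with the largest total value."""
    # drones: {drone: {traffic_type: (value_per_mb, stored_mb)}}
    return {
        name: max(queues, key=lambda t: queues[t][0] * queues[t][1])
        for name, queues in drones.items()
    }

def match_leo(drones, choice, links, current, handover_s):
    """Sub-problem 2 (toy): exhaustive one-to-one LEO-drone assignment.

    Assumes at least as many satellites as drones; a missing link scores zero.
    """
    names = sorted(drones)
    leos = sorted({leo for _, leo in links})
    best_total, best_assign = -1.0, {}
    for perm in permutations(leos, len(names)):
        total = 0.0
        for drone, leo in zip(names, perm):
            value_per_mb, stored_mb = drones[drone][choice[drone]]
            rate_mb_s, visible_s = links.get((drone, leo), (0.0, 0.0))
            # Staying on the current satellite incurs no handover latency.
            delay = 0.0 if current.get(drone) == leo else handover_s
            total += transmitted_value(value_per_mb, stored_mb, rate_mb_s, visible_s, delay)
        if total > best_total:
            best_total, best_assign = total, dict(zip(names, perm))
    return best_total, best_assign

# Hypothetical scenario: two drones, two satellites, two traffic types.
drones = {
    "hap_a": {"video": (1.0, 400.0), "telemetry": (5.0, 100.0)},
    "hap_b": {"video": (1.0, 300.0), "telemetry": (5.0, 20.0)},
}
links = {  # (drone, leo): (rate in MB/s, remaining visible time in s)
    ("hap_a", "leo_1"): (2.0, 30.0),
    ("hap_a", "leo_2"): (1.0, 200.0),
    ("hap_b", "leo_1"): (2.0, 30.0),
    ("hap_b", "leo_2"): (1.0, 200.0),
}
current = {"hap_a": "leo_1", "hap_b": "leo_2"}

choice = match_traffic(drones)
total, assign = match_leo(drones, choice, links, current, handover_s=10.0)
print(choice)
print(total, assign)
```

In this hypothetical scenario, swapping satellites scores higher than keeping the current links despite the handover latency, because the satellite with the longer remaining visible time lets the drone holding the more valuable traffic empty its queue. Exhaustive search scales factorially with the number of drones, which is one reason a learned or otherwise approximate matching becomes attractive at realistic network sizes.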