Abstract

As frequency spectrum resources grow increasingly scarce in satellite communication systems based on transparent transponders, fast and efficient satellite resource allocation algorithms have become key to improving overall resource occupancy. In this paper, we propose a reinforcement learning-based Multi-Branch Deep Q-Network (MBDQN), which uses a TL-Branch and an RP-Branch to extract features of the satellite resource pool state and the task state simultaneously, and a Value-Branch to compute the action-value function. First, MBDQN improves the average resource occupancy performance (AOP) by selecting multiple actions, including task selection and resource priority actions. Second, the trained MBDQN is better suited to online deployment and significantly reduces runtime overhead, because it requires no iteration in the test phase. Experiments on generated task datasets, both non-zero-waste and zero-waste, show that the proposed method outperforms greedy and heuristic methods.
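
To make the multi-branch idea concrete, the following is a minimal sketch of a forward pass and greedy action selection, not the architecture used in the paper. Several things here are assumptions made for illustration only: that the TL-Branch encodes the task state and the RP-Branch encodes the resource pool state, that each branch is a single fully connected layer, that the two embeddings are concatenated before the Value-Branch, and that the task-selection and resource-priority actions are flattened into one joint action index. The class name `MultiBranchQNet`, the layer sizes, and the untrained random weights are likewise hypothetical.

```python
import random


def dense(n_in, n_out, rng):
    """Create a (weights, bias) pair for one fully connected layer."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    bias = [0.0] * n_out
    return weights, bias


def forward(layer, x, relu=True):
    """Apply a fully connected layer, optionally followed by ReLU."""
    weights, bias = layer
    out = [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]
    return [max(0.0, v) for v in out] if relu else out


class MultiBranchQNet:
    """Hypothetical two-encoder Q-network; weights are random, not trained."""

    def __init__(self, task_dim, pool_dim, n_tasks, n_priorities, hidden=8, seed=0):
        rng = random.Random(seed)
        self.n_priorities = n_priorities
        # Assumption: TL-Branch encodes the task state.
        self.tl_branch = dense(task_dim, hidden, rng)
        # Assumption: RP-Branch encodes the resource pool state.
        self.rp_branch = dense(pool_dim, hidden, rng)
        # Assumption: Value-Branch scores every (task, priority) pair jointly.
        self.value_branch = dense(2 * hidden, n_tasks * n_priorities, rng)

    def q_values(self, task_state, pool_state):
        """Return one Q-value per flattened (task, priority) action."""
        h_task = forward(self.tl_branch, task_state)
        h_pool = forward(self.rp_branch, pool_state)
        # List concatenation joins the two branch embeddings.
        return forward(self.value_branch, h_task + h_pool, relu=False)

    def act(self, task_state, pool_state):
        """Greedy action from a single forward pass, with no iterative search."""
        q = self.q_values(task_state, pool_state)
        best = max(range(len(q)), key=q.__getitem__)
        # Unflatten the joint index into (task index, priority index).
        return divmod(best, self.n_priorities)


net = MultiBranchQNet(task_dim=4, pool_dim=6, n_tasks=3, n_priorities=2)
task_idx, priority_idx = net.act([0.2, 0.5, 0.1, 0.9], [1, 0, 0, 1, 1, 0])
```

The `act` method illustrates why a trained value network is cheap at test time: choosing an action costs one forward pass and an argmax, whereas an iterative heuristic has to search again for every new task set.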
