Abstract

This paper formulates a novel long-term beam power allocation (BPA) problem to tackle harmful co-linear interference in a satellite system where geostationary earth orbit (GEO) and low earth orbit (LEO) satellites coexist. The BPA problem aims to maximize the long-term weighted sum rate of the LEO system while keeping the interference that GEO users receive from the LEO satellite system below a predefined threshold. To solve it in real time, a deep reinforcement learning (DRL) framework based on the proximal policy optimization (PPO) algorithm, named drlBPA, is proposed. In addition, the most relevant existing baseline, the fractional optimization (FO)-based BPA scheme, is extended in two ways. First, a greedy strategy is added so that the scheme fully exploits the time resource. Second, a deep neural network approximation scheme is developed to reduce the computational complexity of its iterative solving procedure. Simulation results demonstrate that (i) the trained DRL model of the proposed drlBPA scheme converges well and generalizes well, and (ii) compared with the three FO-based benchmarks, the drlBPA scheme not only achieves the highest LEO system throughput with significantly reduced computation time, but also yields the best system stability.
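For orientation, the BPA problem described above can be sketched in generic form. The notation below is an assumption introduced only for illustration and is not taken from the paper: $p_{b,t}$ is the power assigned to LEO beam $b$ in time slot $t$, $w_b$ is a beam weight, $R_{b,t}$ is the achievable rate, $I^{\mathrm{GEO}}_t$ is the aggregate interference at the GEO user, $I_{\mathrm{th}}$ is the predefined threshold, and $P_{\max}$ is a hypothetical per-slot power budget that the abstract does not mention.

```latex
% Illustrative sketch only; all symbols are assumed, not the paper's notation.
\begin{align}
\max_{\{p_{b,t}\}} \quad & \sum_{t=1}^{T} \sum_{b=1}^{B} w_b \, R_{b,t}(\mathbf{p}_t) \\
\text{s.t.} \quad & I^{\mathrm{GEO}}_t(\mathbf{p}_t) \le I_{\mathrm{th}}, \quad \forall t, \\
& p_{b,t} \ge 0, \quad \sum_{b=1}^{B} p_{b,t} \le P_{\max}, \quad \forall t
  \quad \text{(assumed power budget)}.
\end{align}
```

The paper's exact objective and constraint set may differ, for example in whether the long-term objective is a sum or a time average.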

