Abstract
5G wireless networks are expected to satisfy the diverse delay requirements of various traffic types through network resource scheduling. Existing scheduling methods perform poorly in practice because they rely on unrealistic assumptions, such as access to full channel state information (CSI) or an explicit mathematical expression for network delay. In this paper, we consider the delay-oriented packet scheduling problem in multi-cell 5G downlink networks with multiple users and traffic types (e.g., FTP, VoIP, and video streaming), and formulate it as a partially observable Markov decision process (POMDP). We design a delay-oriented downlink scheduling framework based on deep reinforcement learning (DRL) that autonomously schedules active traffic flows without full channel information. Furthermore, we propose a recurrent proximal policy optimization (RPPO) algorithm that perceives the underlying state and accelerates learning under different time granularities, and we rigorously prove the policy gradient theorem under the POMDP setting. By incorporating future traffic information provided by a proposed spatial-temporal prediction algorithm, RPPO balances load and achieves lower delay in real-time multi-cell multi-user scenarios. Extensive experiments on a realistic 5G simulator demonstrate that our framework significantly outperforms existing approaches, reducing tail delay and average delay by up to 48% and 41.7%, respectively.
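For readers unfamiliar with recurrent PPO, the following is a minimal illustrative sketch of the general idea: an actor-critic whose recurrent layer summarizes the history of partial observations, trained with the PPO clipped objective, as RPPO-style methods do under a POMDP. This is not the authors' implementation; the choice of PyTorch, the GRU cell, and all names, shapes, and hyperparameters (RecurrentActorCritic, hidden size, clip epsilon, loss weights) are assumptions for illustration only.

# Minimal sketch of a recurrent PPO actor-critic for a POMDP scheduler.
# Assumed, not from the paper: PyTorch, GRU recurrence, all names/sizes below.
import torch
import torch.nn as nn

class RecurrentActorCritic(nn.Module):
    def __init__(self, obs_dim, n_actions, hidden=128):
        super().__init__()
        # GRU summarizes the history of partial observations into a belief-like state.
        self.gru = nn.GRU(obs_dim, hidden, batch_first=True)
        self.pi = nn.Linear(hidden, n_actions)   # policy head over scheduling actions
        self.v = nn.Linear(hidden, 1)            # value head

    def forward(self, obs_seq, h=None):
        out, h = self.gru(obs_seq, h)             # out: (B, T, hidden)
        dist = torch.distributions.Categorical(logits=self.pi(out))
        return dist, self.v(out).squeeze(-1), h

def ppo_clip_loss(dist, value, action, old_logp, ret, adv,
                  eps=0.2, c_v=0.5, c_ent=0.01):
    # Standard PPO clipped surrogate plus value and entropy terms.
    logp = dist.log_prob(action)
    ratio = torch.exp(logp - old_logp)
    clipped = torch.clamp(ratio, 1 - eps, 1 + eps) * adv
    policy_loss = -torch.min(ratio * adv, clipped).mean()
    value_loss = (ret - value).pow(2).mean()
    return policy_loss + c_v * value_loss - c_ent * dist.entropy().mean()

# Toy rollout: B sequences of T partial observations -> scheduling decisions.
B, T, obs_dim, n_actions = 4, 16, 10, 5
net = RecurrentActorCritic(obs_dim, n_actions)
obs = torch.randn(B, T, obs_dim)
dist, value, _ = net(obs)
action = dist.sample()
loss = ppo_clip_loss(dist, value, action, dist.log_prob(action).detach(),
                     torch.randn(B, T), torch.randn(B, T))  # dummy returns/advantages
loss.backward()

In practice the returns and advantages would come from rollouts in the 5G simulator rather than the dummy tensors above, and the recurrent hidden state would be carried across scheduling steps within an episode.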