Radio access network (RAN) slicing can significantly improve network flexibility and resource utilization efficiency. Deep reinforcement learning (DRL) is a prevailing approach to practicable resource management in RAN slicing. However, such demand-aware resource allocation suffers from long training times and low execution efficiency, which prevents the network from quickly reaching the desired spectral efficiency and service level agreement (SLA) satisfaction rate. To tackle this issue, we propose a novel resource allocation framework for RAN slicing based on deep hierarchical reinforcement learning, enabling efficient resource scheduling. Structurally, the framework consists of a policy selection network and a policy evaluation network. In particular, a newly built action acceleration unit achieves rapid reward accumulation, thereby speeding up the search for the optimal policy. Extensive simulations show that our proposal achieves higher system utility and faster convergence than state-of-the-art DRL algorithms.
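Since the abstract only names the two component networks, the following PyTorch sketch is purely illustrative of how a policy selection network and a policy evaluation network might be paired in a hierarchical DRL setup: the class names, state encoding, and candidate-policy interface are assumptions for exposition, not the authors' implementation.

```python
import torch
import torch.nn as nn


class PolicySelectionNetwork(nn.Module):
    """High-level selector: maps a slice-demand state to a distribution
    over candidate low-level allocation policies (hypothetical interface)."""

    def __init__(self, state_dim: int, num_policies: int, hidden: int = 128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, num_policies),
        )

    def forward(self, state: torch.Tensor) -> torch.Tensor:
        # Softmax gives selection probabilities over candidate policies.
        return torch.softmax(self.net(state), dim=-1)


class PolicyEvaluationNetwork(nn.Module):
    """Critic-style evaluator: scores a (state, policy-choice) pair with an
    estimated long-term utility, e.g. spectral efficiency plus SLA terms."""

    def __init__(self, state_dim: int, num_policies: int, hidden: int = 128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim + num_policies, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, state: torch.Tensor, policy_probs: torch.Tensor) -> torch.Tensor:
        return self.net(torch.cat([state, policy_probs], dim=-1))


# Toy forward pass (assumed dimensions): 3 slices x 2 demand features -> 4 policies.
state = torch.randn(1, 6)
selector = PolicySelectionNetwork(state_dim=6, num_policies=4)
critic = PolicyEvaluationNetwork(state_dim=6, num_policies=4)
probs = selector(state)           # which policy to apply
value = critic(state, probs)      # how good that choice looks
print(probs, value)
```

In an actor-critic reading of this structure, the evaluation network's value estimate would drive the selection network's updates; the paper's action acceleration unit would additionally bias exploration toward actions that accumulate reward quickly, which this sketch does not model.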