Abstract

The cloud radio access network (CRAN) facilitates resource allocation (RA) by separating remote radio heads (RRHs) from baseband units (BBUs). Traditional RA algorithms save energy by dynamically turning RRHs on/off and allocating power in each time slot. However, when the energy switching cost is considered, the on/off decisions in adjacent time slots become correlated, so the problem can no longer be solved independently in each slot. Fortunately, deep reinforcement learning (DRL) can model such a problem effectively, which motivates us to minimize the total power consumption subject to constraints on per-RRH transmit power and user rates. Our starting point is the deep Q-network (DQN), which combines a neural network with Q-learning. In each time slot, the DQN turns on/off the RRH yielding the largest Q-value (known as the action value) before solving a power minimization problem for the active RRHs. However, DQN suffers from Q-value overestimation, which stems from using the same network both to choose the best action and to compute the target Q-value of taking that action at the next state. To further increase the CRAN power savings, we propose a Double DQN-based framework that decouples action selection from target Q-value generation. Simulation results indicate that the Double DQN-based RA method outperforms the DQN-based RA algorithm in terms of total power consumption.
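To illustrate the decoupling described above, the following minimal PyTorch sketch contrasts the standard DQN target, where one network both selects and evaluates the greedy action, with the Double DQN target, where the online network selects the action and the target network evaluates it. This is only an illustrative sketch, not the authors' implementation; the network names, tensor shapes, and the omission of terminal-state masking and the RRH-specific state/action encoding are assumptions.

```python
import torch

def dqn_target(reward, next_state, gamma, target_net):
    # Standard DQN: the same network picks the greedy action (via max)
    # and supplies its value, which tends to overestimate Q-values.
    next_q = target_net(next_state)                  # shape: [batch, n_actions]
    return reward + gamma * next_q.max(dim=1).values

def double_dqn_target(reward, next_state, gamma, online_net, target_net):
    # Double DQN: the online network selects the action,
    # while the target network evaluates it.
    best_action = online_net(next_state).argmax(dim=1, keepdim=True)
    next_q = target_net(next_state).gather(1, best_action).squeeze(1)
    return reward + gamma * next_q
```

In the RA setting of the abstract, each action would correspond to switching a particular RRH on or off, and the resulting target feeds the usual temporal-difference loss against the online network's Q-value for the action actually taken.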
