Abstract

Cloud radio access networks (C-RANs) have the potential to support growing data traffic in 5G networks. However, given their complex states, resource allocation in C-RANs is time-consuming and computationally expensive, making it challenging to meet the energy-efficiency and low-latency demands of real-time wireless applications. In this paper, we propose a gradient boosting decision tree (GBDT)-based deep Q-network (DQN) framework for solving the dynamic resource allocation (DRA) problem in a real-time C-RAN, in which the heavy computation needed to solve second order cone programming (SOCP) problems is cut down and significant power consumption is saved. First, we apply the GBDT to a regression task that approximates the solutions of the SOCP problem formulated from the beamforming design, which consumes heavy computing resources under traditional algorithms. Then, we design a DQN, a common deep reinforcement learning architecture, to autonomously generate a robust policy that controls the status of remote radio heads (RRHs) and reduces long-term power consumption. The DQN deploys deep neural networks (DNNs) to handle the innumerable states of the real-time C-RAN system, and generates the policy by observing the state and the reward produced by the GBDT. The generated policy is error-tolerant, considering that the gradient boosting regression may not strictly satisfy the constraints of the original problem. Simulation results validate the framework's advantages over existing methods in both power-saving performance and computational complexity.

Highlights

  • The demand for mobile data services has been continuously rising with the increasing number of wireless devices such as smartphones and tablets [1]

  • We propose a novel deep Q-network (DQN)-based architecture, where the immediate reward is obtained from a GBDT-based regressor instead of second order cone programming (SOCP) solutions, to generate the optimal policy for controlling the states of remote radio heads (RRHs)

  • We first employed the gradient boosting decision tree (GBDT) to approximate the solutions of the SOCP problem



Introduction

The demand for mobile data services has been continuously rising with the increasing number of wireless devices such as smartphones and tablets [1]. Cloud radio access networks (C-RANs) [2] are regarded as a promising mobile network architecture to meet this challenge. In a C-RAN, the baseband units (BBUs) can be placed in a convenient and accessible location, and the RRHs can be deployed on poles or rooftops on demand. To obtain the optimal allocation strategy, several works have applied convex optimization techniques, such as second order cone programming (SOCP) in [3], semi-definite programming (SDP) in [4], and mixed-integer programming (MIP) in [5]. In real-time C-RANs, where the environment keeps changing, the efficiency of these methods in finding the optimal decision faces great challenges. Attempts have been made with reinforcement learning (RL) to increase the efficiency of the solution procedure in [3], [6].
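The core workaround described above — training a GBDT regressor to approximate expensive SOCP solver outputs, then using its prediction in place of the solver call when computing the DQN's reward — can be sketched as follows. This is a minimal illustration, not the paper's implementation: the dataset is synthetic, the feature dimensions are arbitrary, and the quadratic target merely stands in for the SOCP-optimal power values a real solver would produce offline.

```python
# Sketch: approximate SOCP solver outputs with a GBDT regressor, then use
# the regressor's prediction as the immediate reward inside a DQN loop.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(0)
n_samples, n_features = 2000, 16               # illustrative sizes
# Features stand in for channel-state info plus RRH on/off indicators.
X = rng.normal(size=(n_samples, n_features))
# Stand-in target: the "optimal transmit power" an SOCP solver would return.
y = (X ** 2).sum(axis=1) + rng.normal(scale=0.1, size=n_samples)

# Fit the regressor on the offline (state, solver-output) dataset.
gbdt = GradientBoostingRegressor(n_estimators=200, max_depth=3,
                                 learning_rate=0.1)
gbdt.fit(X[:1600], y[:1600])

# At DQN training time, a state-action pair is featurized the same way,
# and the cheap prediction replaces the expensive SOCP call:
state_action = rng.normal(size=(1, n_features))
approx_power = gbdt.predict(state_action)[0]
reward = -approx_power   # lower predicted power consumption -> higher reward
```

Because the regressor is only an approximation of the solver, its outputs may mildly violate the original problem's constraints — which is exactly why the paper stresses that the learned policy must be error-tolerant.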

