Abstract

In current research, artificial intelligence (AI) plays a crucial role in resource management for next-generation wireless communication networks. However, traditional reinforcement learning (RL) cannot handle continuous, high-dimensional problems. To address this, deep neural networks (DNNs) are introduced into RL to cope with high-dimensional state and action spaces. In this paper, we first construct an information interaction model among the primary user (PU), the secondary user (SU), and wireless sensors in a cognitive radio system. In this model, the SU cannot obtain the PU's power allocation information and must adjust its own transmit power using the received signal strengths (RSSs) reported by the wireless sensors, while the PU allocates transmit power according to its own power control scheme. We propose an asynchronous advantage actor-critic (A3C)-based power control scheme for the SU, a parallel actor-learner framework with root mean square propagation (RMSProp) optimization. Multiple SUs learn the power control policy simultaneously on different CPU threads, which reduces the interdependence of neural network gradient updates. To further improve spectrum-sharing efficiency, a distributed proximal policy optimization (DPPO)-based power control scheme is also proposed; it is an asynchronous actor-critic variant with adaptive moment estimation (Adam) optimization that enables the network to converge quickly. After several power adjustments, the PU and the SU meet quality of service (QoS) requirements and achieve spectrum sharing.
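To make the A3C-based approach concrete, the sketch below shows one asynchronous actor-learner that maps sensor RSS observations to a discrete SU power level and pushes its gradients to a shared network updated with RMSProp. It is only a minimal illustration under assumed details: the environment interface (`env.reset()`, `env.step()`, `n_sensors`, `n_power_levels`), the discrete power levels, and the network sizes are hypothetical stand-ins, not the state, action, or reward design used in the paper.

```python
# Minimal A3C-style worker sketch (PyTorch). Assumptions: a hypothetical `env`
# exposing sensor RSS observations and a QoS-based reward, and discrete SU
# power levels; the paper's exact model is not reproduced here.
import torch
import torch.nn as nn
import torch.nn.functional as F


class ActorCritic(nn.Module):
    """Actor-critic network mapping the sensor RSS vector to a power-level policy."""

    def __init__(self, n_sensors, n_power_levels, hidden=64):
        super().__init__()
        self.body = nn.Sequential(nn.Linear(n_sensors, hidden), nn.ReLU())
        self.policy = nn.Linear(hidden, n_power_levels)  # actor head
        self.value = nn.Linear(hidden, 1)                # critic head

    def forward(self, rss):
        h = self.body(rss)
        return F.softmax(self.policy(h), dim=-1), self.value(h)


def worker(global_net, optimizer, env, gamma=0.99, t_max=20):
    """One asynchronous actor-learner: rolls out with a local copy of the
    network, then applies its gradients to the shared network (each worker
    runs on its own CPU thread)."""
    local_net = ActorCritic(env.n_sensors, env.n_power_levels)
    local_net.load_state_dict(global_net.state_dict())

    rss = env.reset()                                    # initial sensor RSS vector
    log_probs, values, rewards = [], [], []
    for _ in range(t_max):
        probs, value = local_net(torch.as_tensor(rss, dtype=torch.float32))
        dist = torch.distributions.Categorical(probs)
        action = dist.sample()                           # SU transmit-power level
        rss, reward, done = env.step(action.item())      # PU reacts; sensors report new RSSs
        log_probs.append(dist.log_prob(action))
        values.append(value)
        rewards.append(reward)
        if done:
            break

    # n-step returns and advantages for the actor and critic losses
    R = torch.zeros(1)
    policy_loss, value_loss = torch.zeros(1), torch.zeros(1)
    for log_p, v, r in zip(reversed(log_probs), reversed(values), reversed(rewards)):
        R = r + gamma * R
        advantage = R - v
        policy_loss = policy_loss - log_p * advantage.detach()
        value_loss = value_loss + advantage.pow(2)

    optimizer.zero_grad()
    (policy_loss + 0.5 * value_loss).backward()
    # copy local gradients into the shared network, then apply the RMSProp step
    for lp, gp in zip(local_net.parameters(), global_net.parameters()):
        gp._grad = lp.grad
    optimizer.step()
```

In this sketch the shared network would be created once (e.g. with `global_net.share_memory()`), paired with `torch.optim.RMSprop(global_net.parameters(), lr=1e-3)`, and `worker` would be launched on several CPU threads or processes so the asynchronous gradient updates decorrelate, which is the property the A3C scheme in the abstract relies on. The DPPO variant would replace the loss with a clipped surrogate objective and use Adam instead of RMSProp.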
