Abstract

Facing the dramatic increase in the number of mobile devices and the scarcity of spectrum resources, grant-free nonorthogonal multiple access (NOMA) emerges as an enabling technology for massive access, which also effectively reduces signaling overhead and access latency. However, in grant-free NOMA systems, collisions resulting from uncoordinated resource selection can cause severe interference and reduce system throughput. In this article, we apply deep reinforcement learning (DRL) to the decision making in grant-free NOMA systems, to mitigate collisions and improve system throughput in an unknown network environment. To reduce collisions in the frequency domain and the computational complexity of DRL, subchannel and device clustering are first designed, where a cluster of devices compete for a cluster of subchannels following grant-free NOMA. Furthermore, discrete uplink power control is proposed to reduce intracluster collisions. Then, the long-term cluster throughput maximization problem is formulated as a partially observable Markov decision process (POMDP). To address the POMDP, a DRL-based grant-free NOMA algorithm is proposed to learn the network contention status and output subchannel and received power-level selections with fewer collisions. The numerical results verify the effectiveness of the proposed algorithm and reveal that DRL-based grant-free NOMA outperforms slotted ALOHA NOMA, with system throughput gains of 32.9% and 156% when the number of devices is twice and five times that of the subchannels, respectively. When the number of devices is five times that of the subchannels, the access success probability of DRL-based grant-free NOMA is above 85%, compared to 33% in the slotted ALOHA NOMA system.
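The selection mechanism the abstract describes, each device in a cluster learning to pick a (subchannel, received power level) pair that avoids intracluster collisions, can be illustrated with a toy sketch. The paper's method is a DRL agent acting on a POMDP observation history; the sketch below substitutes independent tabular Q-learners (a stateless bandit simplification, not the paper's network architecture) and a synthetic collision reward, and all sizes (`N_SUBCHANNELS`, `N_POWER_LEVELS`, `N_DEVICES`) are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

N_SUBCHANNELS = 2    # subchannels assigned to the cluster (hypothetical)
N_POWER_LEVELS = 2   # discrete received power levels (hypothetical)
N_ACTIONS = N_SUBCHANNELS * N_POWER_LEVELS
N_DEVICES = 4        # devices competing within the cluster (hypothetical)

# One Q-vector per device over the joint (subchannel, power-level) action space.
Q = np.zeros((N_DEVICES, N_ACTIONS))
alpha, eps = 0.1, 0.1  # learning rate and exploration probability

def step(actions):
    """Synthetic reward: 1 for each device that is alone on its
    (subchannel, power-level) pair, i.e. no intracluster collision."""
    counts = np.bincount(actions, minlength=N_ACTIONS)
    return (counts[actions] == 1).astype(float)

for t in range(5000):
    explore = rng.random(N_DEVICES) < eps
    greedy = Q.argmax(axis=1)
    random_a = rng.integers(N_ACTIONS, size=N_DEVICES)
    actions = np.where(explore, random_a, greedy)
    rewards = step(actions)
    idx = np.arange(N_DEVICES)
    Q[idx, actions] += alpha * (rewards - Q[idx, actions])

# Greedy choices after training; with enough trials the devices often
# settle on distinct pairs, mirroring the collision avoidance learned by DRL.
print(Q.argmax(axis=1))
```

In the actual system the agent would also exploit feedback (ACK/collision observations) as partial state, which is what motivates the POMDP formulation and a deep Q-network rather than a table.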
