Buffer-Aided Relay Selection for Cooperative Hybrid NOMA/OMA Networks With Asynchronous Deep Reinforcement Learning

Chong Huang, Gaojie Chen, Peng Xu, Zhu Han, Jonathon A. Chambers, Yu Gong

doi:10.1109/jsac.2021.3087225

Abstract

This paper investigates asynchronous reinforcement learning algorithms for joint buffer-aided relay selection and power allocation in a non-orthogonal multiple access (NOMA) relay network. Under hybrid NOMA/orthogonal multiple access (OMA) transmission, we jointly optimize relay selection and power allocation to maximize throughput subject to a delay constraint. To solve this complicated high-dimensional optimization problem, we propose two asynchronous reinforcement learning-based schemes: an asynchronous deep Q-learning network (ADQN)-based scheme and an asynchronous advantage actor-critic (A3C)-based scheme. The A3C-based scheme achieves better performance and robustness when the action space is large, while the ADQN-based scheme converges faster when the action space is small. Moreover, a priori information is exploited to improve the convergence of the proposed schemes. Simulation results show that the proposed asynchronous learning-based schemes can learn from the environment and achieve good convergence.

Full Text