A Q-Learning-Based Resource Allocation for Downlink Non-Orthogonal Multiple Access Systems Considering QoS

Qi Zhai, Miodrag Bolic, Wei Cheng, Chenxi Liu, Yong Li

doi:10.1109/access.2021.3080283

Abstract

As a technology that can accommodate more users and significantly improve spectral efficiency, non-orthogonal multiple access (NOMA) has attracted the attention of many scholars in recent years. The basic idea of NOMA is to implement multiple access in the power domain and decode the desired signal via successive interference cancellation (SIC). However, the resource allocation problem in such a NOMA system is non-convex, so it is difficult to solve directly through conventional methods. We therefore propose a reinforcement learning (RL) approach based on cooperative Q-learning to solve the resource allocation problem in multi-antenna downlink NOMA systems. First, we formulate the resource allocation process as a sum rate maximization problem, subject to the power budget constraints and a quality of service (QoS) condition. Second, we design a reward function to improve the sum rate while meeting the power and capacity constraints. Multiple Q-tables are created and cooperatively updated to obtain the optimal beamforming matrix. Then, we analyze the convergence of our proposed RL-based power allocation method. Our simulations show that the proposed power allocation scheme yields excellent performance in terms of sum rate, energy efficiency, and spectral efficiency.
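As an illustration only, a sum rate maximization problem of the kind the abstract describes could be written as follows. The notation is an assumption and is not taken from the paper: N users sorted from weakest to strongest channel, beamforming vector w_n and channel vector h_n for user n, noise power σ², total power budget P_max, and a per-user minimum rate R_min. The sketch also assumes ideal SIC, so user n only sees interference from the signals of users n+1, ..., N, and it omits the extra conditions needed to guarantee that SIC decoding succeeds.

```latex
\begin{aligned}
\max_{\{\mathbf{w}_n\}} \quad & \sum_{n=1}^{N} \log_2\!\left(1 + \mathrm{SINR}_n\right) \\
\text{s.t.} \quad & \sum_{n=1}^{N} \lVert \mathbf{w}_n \rVert^2 \le P_{\max}, \\
& \log_2\!\left(1 + \mathrm{SINR}_n\right) \ge R_{\min}, \quad n = 1, \dots, N, \\
\text{where} \quad & \mathrm{SINR}_n = \frac{\lvert \mathbf{h}_n^{H} \mathbf{w}_n \rvert^2}{\sum_{j=n+1}^{N} \lvert \mathbf{h}_n^{H} \mathbf{w}_j \rvert^2 + \sigma^2}.
\end{aligned}
```

The beamforming vectors appear in both the numerator and the denominator of each SINR term, which is one way to see why such a problem is non-convex.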

Highlights

The development of mobile internet and internet of things (IoT) has put forward challenging requirements for the fifth-generation wireless communication system (5G), which is expected to achieve higher spectral efficiency and lower latency [1], [2].
In this paper we consider a downlink multiple-input single-output (MISO) non-orthogonal multiple access (NOMA) communication scenario consisting of one base station (BS) and N users, where the users are randomly distributed at different distances from the BS and each is equipped with a single antenna.
Initialization: create Q-tables for each antenna; set the total iterations Itot and error threshold ζ; define state space and action space.
while error ≥ ζ and episodes ≤ Itot do
    initial state s;
    for all steps of episode do
        for all Q-tables do
            choose a from the action space based on ε-greedy algorithm;
        end
        perform action a and calculate the sum rate Csum of the system;
        for all Q-tables do
            measure the reward R and new state s′;
            update Q(s, a) according to the updating rule;
            let s = s′;
        end
    end
end
the SRMax method (SRPA) proposed in [33] as benchmarks
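The loop structure of the algorithm highlight above can be sketched in Python as follows. This is a minimal toy under stated assumptions, not the authors' implementation: the two-user rate model in `rates`, the discrete power-share levels, the three actions (decrease, keep, increase), the penalty of -1 for a QoS violation, and all parameter values are hypothetical. Cooperation is modeled here simply as every Q-table receiving the same reward computed from the joint action.

```python
import math
import random


def rates(levels, gains, n_levels, p_ant=1.0, noise=0.1):
    """Toy two-user rates; levels[m] sets antenna m's power share for the weak user."""
    fracs = [(lv + 1) / (n_levels + 1) for lv in levels]
    weak_sig = sum(g * f * p_ant for g, f in zip(gains[0], fracs))
    weak_int = sum(g * (1 - f) * p_ant for g, f in zip(gains[0], fracs))
    strong_sig = sum(g * (1 - f) * p_ant for g, f in zip(gains[1], fracs))
    r_weak = math.log2(1 + weak_sig / (weak_int + noise))
    r_strong = math.log2(1 + strong_sig / noise)  # weak user's signal removed by SIC
    return r_weak, r_strong


def reward(levels, gains, n_levels, r_min):
    """Sum rate if every user meets the assumed QoS rate r_min, else a fixed penalty."""
    r = rates(levels, gains, n_levels)
    return sum(r) if min(r) >= r_min else -1.0


def train(gains, n_levels=5, r_min=0.5, i_tot=500, steps=20, zeta=1e-4,
          alpha=0.1, gamma=0.9, eps=0.1, seed=0):
    rng = random.Random(seed)
    n_ant = len(gains[0])
    actions = (-1, 0, 1)  # assumed action space: lower, keep, or raise the level
    # One Q-table per antenna, indexed as q[m][state][action_index].
    q = [[[0.0] * len(actions) for _ in range(n_levels)] for _ in range(n_ant)]
    error, episode = float("inf"), 0
    while error >= zeta and episode < i_tot:
        episode += 1
        error = 0.0
        s = [rng.randrange(n_levels) for _ in range(n_ant)]  # initial state
        for _ in range(steps):
            a = []
            for m in range(n_ant):  # epsilon-greedy choice per Q-table
                if rng.random() < eps:
                    a.append(rng.randrange(len(actions)))
                else:
                    row = q[m][s[m]]
                    a.append(row.index(max(row)))
            # Perform the joint action and compute the shared reward.
            s_next = [min(n_levels - 1, max(0, s[m] + actions[a[m]]))
                      for m in range(n_ant)]
            r = reward(s_next, gains, n_levels, r_min)
            for m in range(n_ant):  # standard Q-learning update per table
                old = q[m][s[m]][a[m]]
                target = r + gamma * max(q[m][s_next[m]])
                q[m][s[m]][a[m]] = old + alpha * (target - old)
                error = max(error, abs(q[m][s[m]][a[m]] - old))
            s = s_next
    return q, episode


if __name__ == "__main__":
    toy_gains = [[0.2, 0.3], [1.0, 0.8]]  # hypothetical per-antenna gains, weak then strong user
    q_tables, episodes_run = train(toy_gains)
    print(len(q_tables), episodes_run)
```

Here `error` is taken to be the largest Q-value change within an episode, which is one plausible reading of the stopping rule. With a fixed exploration rate it will often stay above ζ, so the loop typically ends at the episode limit.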

Summary

INTRODUCTION

The development of mobile internet and internet of things (IoT) has put forward challenging requirements for the fifth-generation wireless communication system (5G), which is expected to achieve higher spectral efficiency and lower latency [1], [2].

B. RELATED WORKS

Research on resource allocation in NOMA communication systems has made progress in several areas, such as user pairing [15], [16], channel assignment, and power allocation [17], [18]. The problem of user association and channel assignment in downlink multi-cell NOMA networks is addressed in [19], where the authors propose a low-complexity iterative solution to obtain the optimal power allocation while accounting for inter-user interference and maintaining QoS per user. In [20], the authors predefine some power allocation schemes and investigate the user pairing problem in NOMA uplink communication systems. To the best of our knowledge, there is currently no research on solving the power allocation problem of multi-antenna NOMA systems through a cooperative Q-learning algorithm. Unlike in single-antenna NOMA systems, channel gains in MISO NOMA systems are jointly determined by the multiple antennas between the BS and users.
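To illustrate how a MISO channel gain depends jointly on all transmit antennas, the short sketch below computes the effective gain |h^H w|^2 for a channel vector h and a beamforming vector w. The function name and the example values are hypothetical and are not taken from the paper.

```python
def effective_gain(h, w):
    """Return |h^H w|^2 for a channel vector h and a beamforming vector w."""
    inner = sum(hm.conjugate() * wm for hm, wm in zip(h, w))
    return abs(inner) ** 2


h = [1 + 1j, 1 - 1j]                      # assumed two-antenna channel
norm_h = sum(abs(hm) ** 2 for hm in h) ** 0.5
w_single = [1, 0]                         # all power on the first antenna
w_matched = [hm / norm_h for hm in h]     # unit-norm beam aligned with h

print(effective_gain(h, w_single))
print(effective_gain(h, w_matched))
```

Both beamforming vectors have unit norm, yet the aligned one yields a gain of about 4 (the squared norm of h) while the single-antenna one yields about 2, so the gain a user sees depends on how power and phase are spread across antennas.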

SYSTEM MODEL

ALGORITHM DESCRIPTION

SIMULATION RESULTS

CONCLUSION

Journal: IEEE Access	Publication Date: Jan 1, 2021
Citations: 10	License type: CC BY 4.0

Similar Papers

Outage Balancing in Downlink Non-Orthogonal Multiple Access With Statistical Channel State Information
Sulong Shi ... Hongbo Zhu
IEEE Transactions on Wireless Communications | VOL. 15
01 Jan 2015

Multichannel Resource Allocation for Downlink Non-Orthogonal Multiple Access Systems
Jianyue Zhu ... Jiaheng Wang
01 Dec 2017

Performance analysis at far and near user in NOMA based system in presence of SIC error
Monika Jain ... Divyang Rawal
AEU - International Journal of Electronics and Communications | VOL. 114
17 Nov 2019

On Optimal Power Allocation for Downlink Non-Orthogonal Multiple Access Systems
Jianyue Zhu ... Yongming Huang
IEEE Journal on Selected Areas in Communications | VOL. 35
01 Jan 2017
