Abstract
In this letter, a deep Q-learning network (DQN) based resource allocation (RA) scheme is proposed for massive multiple-input multiple-output (MIMO)-nonorthogonal multiple access (NOMA) systems. A reinforcement learning (RL) framework is developed to build an iterative optimization structure for user clustering, power allocation, and beamforming. Specifically, a DQN is designed to group the users based on the reward calculated after power allocation and beamforming. The objective is to maximize this reward, i.e., the system throughput. A back propagation neural network (BPNN) is then used to perform the power allocation. To train the BPNN, the results of an exhaustive search over a quantized power set are used as the output labels. Simulation results show that the proposed scheme achieves a system spectrum efficiency close to that of exhaustive-search-based user clustering and power allocation.
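The label-generation step can be illustrated with a minimal sketch. The abstract does not give the system model, so everything below is an assumption made for illustration: a two-user cluster, scalar effective channel gains after beamforming, successive interference cancellation at the stronger user, and an optional minimum-rate constraint on the weaker user. The function names `cluster_rates` and `exhaustive_power_label` are hypothetical.

```python
import math

def cluster_rates(alpha_weak, g_strong, g_weak, total_power=1.0, noise=1.0):
    """Rates (bit/s/Hz) of an assumed two-user NOMA cluster.

    alpha_weak is the fraction of the cluster power given to the weak user.
    """
    p_weak = alpha_weak * total_power
    p_strong = (1.0 - alpha_weak) * total_power
    # Weak user decodes its own signal and treats the strong user's as interference.
    r_weak = math.log2(1.0 + p_weak * g_weak / (p_strong * g_weak + noise))
    # Strong user is assumed to remove the weak user's signal before decoding.
    r_strong = math.log2(1.0 + p_strong * g_strong / noise)
    return r_weak, r_strong

def exhaustive_power_label(g_strong, g_weak, levels, min_rate=0.0):
    """Search every quantized power level and return (best_alpha, best_sum_rate).

    Returns (None, -inf) if no level meets the assumed minimum weak-user rate.
    """
    best_alpha, best_rate = None, float("-inf")
    for alpha in levels:
        r_weak, r_strong = cluster_rates(alpha, g_strong, g_weak)
        if r_weak < min_rate:
            continue  # infeasible under the assumed fairness constraint
        total = r_weak + r_strong
        if total > best_rate:
            best_alpha, best_rate = alpha, total
    return best_alpha, best_rate

# Quantized power set (assumed values) and one training label.
levels = [0.5, 0.6, 0.7, 0.8, 0.9]
label, rate = exhaustive_power_label(g_strong=10.0, g_weak=1.0,
                                     levels=levels, min_rate=0.5)
```

In a setup like this, each pair of channel gains and its `label` would form one training sample for the power-allocation network, and the resulting sum rate would play the role of the throughput reward fed back to the clustering agent.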