Abstract

In this letter, we study the resource allocation problem in a multiuser multi-antenna system in which the transmitter is powered by both grid energy and harvested energy. Our objective is to maximize the long-term sum throughput, subject to the energy supply constraint, by jointly optimizing the beamforming vectors and the energy allocation. To address the challenges of imperfect channel state information (CSI) and large state and action spaces, we propose a dimension-reduction deep reinforcement learning (RL) method to solve the optimization problem. In the proposed algorithm, the beamforming vectors are first determined from the imperfect CSI, and policy-based RL is then employed to find the optimal mapping from the low-dimensional system state to the transmit powers. Simulation results demonstrate that the proposed algorithm outperforms traditional algorithms in terms of both steady-state performance and learning speed.
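
To make the two-stage structure concrete, the following is a minimal sketch of the first stage and of the objective being maximized, not the letter's actual implementation. It assumes maximum ratio transmission (MRT) beamforming computed from the CSI estimates, which the abstract does not specify, and it evaluates the standard sum-rate expression with inter-user interference treated as noise. The function names `mrt_beamformers` and `sum_throughput`, the noise power, and the example channels are all hypothetical; the transmit powers passed in stand for the output of the learned RL policy.

```python
import math

def mrt_beamformers(h_est):
    """Unit-norm MRT beamformer per user: w_k = h_est_k / ||h_est_k||.

    MRT is an assumption made for illustration only; the abstract does
    not say which beamforming design is used.
    """
    beams = []
    for h in h_est:
        norm = math.sqrt(sum(abs(x) ** 2 for x in h))
        beams.append([x / norm for x in h])
    return beams

def sum_throughput(h_true, beams, powers, noise_power=1.0):
    """Sum rate in bit/s/Hz, treating inter-user interference as noise."""
    def gain(h, w):
        # Effective channel gain |h^H w|^2.
        return abs(sum(a.conjugate() * b for a, b in zip(h, w))) ** 2

    total = 0.0
    for k, h in enumerate(h_true):
        signal = powers[k] * gain(h, beams[k])
        interference = sum(
            powers[j] * gain(h, beams[j])
            for j in range(len(beams)) if j != k
        )
        total += math.log2(1.0 + signal / (interference + noise_power))
    return total

# Hypothetical two-user, two-antenna example.
h_true = [[1 + 0j, 0j], [0j, 1 + 0j]]           # actual channels
h_est = [[1 + 0j, 0.5 + 0j], [0.5 + 0j, 1 + 0j]]  # imperfect estimates
powers = [3.0, 1.0]  # placeholder for the RL policy's power allocation

rate_perfect = sum_throughput(h_true, mrt_beamformers(h_true), powers)
rate_imperfect = sum_throughput(h_true, mrt_beamformers(h_est), powers)
```

In this toy setting the beamformers built from the imperfect estimates leak power between users, so `rate_imperfect` is lower than `rate_perfect`. Fixing the beamformers first leaves only the per-user powers for the RL agent to choose, which is one way to read the dimension reduction described above.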
