Abstract

We consider a heterogeneous network (HetNet) in which multiple access points (APs) of potentially different transmission capacities serve users simultaneously via beamforming in the same spectrum band. We propose a beamforming framework that uses multi-agent deep reinforcement learning (DRL) to maximize the system downlink sum-rate of the HetNet. In our framework, each AP acts as an agent equipped with an online policy deep neural network (DNN) and an online Q-function DNN. The former generates the AP's beamforming vector from local observations only in each time slot, while the latter evaluates the appropriateness of this beamforming vector. We present a distributed-updating-centralized-rewarding scheme that trains the policy DNNs and Q-function DNNs of all the APs online by trial and error. Under this scheme, all the APs take the system downlink sum-rate in a recent time slot (reported by a central controller) as their identical one-step reward. Because each AP's local DNNs are trained in every time slot on experience items that carry this centralized reward, their weight vectors are updated toward the global optimum. Simulation results demonstrate that the proposed framework converges quickly and outperforms the benchmark beamforming methods in terms of system downlink sum-rate.
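The centralized reward described above can be illustrated with a minimal sketch. The abstract does not give the system model, so the following are assumptions made only for illustration: AP k serves a single single-antenna user k, interference is treated as noise, and the rate is the Shannon rate log2(1 + SINR). The names `sum_rate`, `shared_rewards`, `channels`, `beams`, and `noise_power` are hypothetical and do not come from the paper.

```python
import math


def sum_rate(channels, beams, noise_power=1.0):
    """Downlink sum-rate in bits/s/Hz under the assumptions stated above.

    channels[j][k] is the channel vector (list of complex) from AP j to user k.
    beams[j] is the beamforming vector (list of complex) chosen by AP j.
    Assumption: AP k serves user k, and every other AP interferes with user k.
    """
    n = len(beams)

    def rx_power(j, k):
        # Received power at user k from AP j: |h_{j,k}^H w_j|^2
        inner = sum(h.conjugate() * w for h, w in zip(channels[j][k], beams[j]))
        return abs(inner) ** 2

    total = 0.0
    for k in range(n):
        signal = rx_power(k, k)
        interference = sum(rx_power(j, k) for j in range(n) if j != k)
        total += math.log2(1.0 + signal / (interference + noise_power))
    return total


def shared_rewards(channels, beams, noise_power=1.0):
    """Return one identical reward per AP, as a central controller might."""
    reward = sum_rate(channels, beams, noise_power)
    return [reward] * len(beams)


# Two single-antenna APs whose cross links are as strong as their direct links.
channels = [
    [[1 + 0j], [1 + 0j]],  # AP 0 -> user 0, AP 0 -> user 1
    [[1 + 0j], [1 + 0j]],  # AP 1 -> user 0, AP 1 -> user 1
]
beams = [[1 + 0j], [1 + 0j]]
rewards = shared_rewards(channels, beams)
```

In this toy case each user sees an SINR of 1 / (1 + 1) = 0.5, so both APs receive the same reward of 2 * log2(1.5). Because the reward is shared, an AP that raises its own user's rate by increasing interference to others can still lower the reward it receives, which is the intuition behind rewarding all agents with the system-level sum-rate.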
