Abstract

To meet the explosively growing demand for high data rates, hybrid beamforming has become a promising paradigm for massive MIMO millimeter-wave (mmWave) systems. In this paper, we study the joint optimization of hybrid beamforming to maximize the achievable sum rate in networked dynamic massive MIMO mmWave systems. To address this problem, we propose a novel Model-Assisted Decentralized Multi-Agent Reinforcement Learning (MAD-MARL) algorithm, which incorporates three designs: 1) a model-based prediction approach, which greatly accelerates policy learning and predicts each agent's potential future, thereby alleviating the impact of delayed information sharing in a networked system; 2) an attentional prediction-sharing approach, which reduces the non-stationarity of the environment by combining the proposed model-based prediction with a self-attention strategy, so that agents reach global consensus and coordinate cooperatively in a distributed manner; and 3) a decentralized model-free learning approach, which trains the agents with the assistance of the model-based predictions. Extensive simulations show that MAD-MARL significantly increases the learning speed and greatly improves overall performance compared with existing work.
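As a rough illustration of the attentional prediction-sharing idea, the sketch below shows how one agent might weight the predictions shared by its neighbors using scaled dot-product attention. It is only a minimal sketch under stated assumptions: the abstract does not specify the attention architecture, so the function names (`attend`, `softmax`, `dot`), the use of raw predictions as queries, keys, and values (no learned projections), and the toy vectors are all hypothetical and not the authors' implementation.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def attend(own_prediction, neighbor_predictions):
    """Fuse neighbors' predictions with scaled dot-product attention.

    Assumption: the agent's own prediction acts as the query and each
    neighbor's prediction acts as both key and value. A learned model
    would normally apply trainable projections first.
    """
    dim = len(own_prediction)
    scale = math.sqrt(dim)
    scores = [dot(own_prediction, p) / scale for p in neighbor_predictions]
    weights = softmax(scores)
    fused = [
        sum(w * p[i] for w, p in zip(weights, neighbor_predictions))
        for i in range(dim)
    ]
    return weights, fused


# Toy example: three neighbors with aligned, orthogonal, and opposing predictions.
own = [1.0, 0.0]
neighbors = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
weights, fused = attend(own, neighbors)
```

In this toy setting, the neighbor whose prediction aligns most closely with the agent's own receives the largest weight, which is the basic mechanism by which attention can emphasize the most relevant shared information.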

