A Novel Model-Assisted Decentralized Multi-Agent Reinforcement Learning for Joint Optimization of Hybrid Beamforming in Massive MIMO mmWave Systems

Yan Yan, Jianjun Bai, Baoxian Zhang, Zheng Yao, Cheng Li

doi:10.1109/tvt.2023.3280910

Abstract

To meet the explosively growing demand for high data rates, hybrid beamforming has become a promising paradigm for massive MIMO millimeter wave (mmWave) systems. In this paper, we study the joint optimization of hybrid beamforming to maximize the achievable sum rate in networked dynamic massive MIMO mmWave systems. To address this problem, we propose a novel Model-Assisted Decentralized Multi-Agent Reinforcement Learning (MAD-MARL) algorithm, which incorporates the following three designs: 1) a model-based prediction approach, which greatly accelerates policy learning and also predicts each agent's potential future so as to alleviate the impact of delayed information sharing in a networked system; 2) an attentional prediction sharing approach, which effectively reduces the non-stationarity of the environment by combining the proposed model-based prediction with a self-attention strategy, so that agents reach global consensus and cooperative inter-agent coordination in a distributed manner; and 3) a decentralized model-free learning approach, which trains the agents with the assistance of model-based predictions. Extensive simulations are conducted, and the numerical results demonstrate that MAD-MARL significantly increases the learning speed and greatly improves the overall performance compared with existing work.
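To make the optimization objective concrete, the following minimal sketch evaluates an achievable sum rate for a single-cell multi-user downlink with hybrid beamforming. It is an illustration under stated assumptions and not the system model or the MAD-MARL algorithm of the paper: it uses the textbook expression sum_k log2(1 + SINR_k) with single-antenna users, and the names `matmul`, `sum_rate`, `H`, `F_rf`, `F_bb`, and `noise_power` are hypothetical.

```python
import cmath
import math


def matmul(A, B):
    """Multiply two matrices stored as lists of rows (complex entries allowed)."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def sum_rate(H, F_rf, F_bb, noise_power):
    """Sum rate in bits/s/Hz for K single-antenna users (assumed model).

    H     : K x Nt   effective channel, row k is the channel seen by user k
    F_rf  : Nt x Nrf analog precoder (phase-only entries)
    F_bb  : Nrf x K  digital precoder, column k serves user k
    """
    # G[k][j] is the gain with which user k receives the stream meant for user j.
    G = matmul(matmul(H, F_rf), F_bb)
    total = 0.0
    for k, row in enumerate(G):
        powers = [abs(g) ** 2 for g in row]
        signal = powers[k]
        interference = sum(powers) - signal
        total += math.log2(1.0 + signal / (interference + noise_power))
    return total


# Toy setup (assumed values): 2 users, 2 antennas, 2 RF chains.
scale = 1.0 / math.sqrt(2)
H = [[1, 0], [0, 1]]
# Analog precoder: constant-modulus entries, only the phases differ.
F_rf = [[scale * cmath.exp(1j * 0.0), scale * cmath.exp(1j * 0.0)],
        [scale * cmath.exp(1j * 0.0), scale * cmath.exp(1j * math.pi)]]
# Digital precoder chosen so that F_rf @ F_bb is (numerically) the identity.
F_bb = [[scale, scale], [scale, -scale]]

rate = sum_rate(H, F_rf, F_bb, noise_power=1.0)
print(f"sum rate: {rate:.3f} bits/s/Hz")
```

In this toy setup the combined precoder removes inter-user interference, so each user sees an SINR of about 1 and the sum rate is about 2 bits/s/Hz. A learning-based method would treat a quantity of this kind as the reward signal while choosing the precoders.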

Full Text