Cell-free massive multiple-input–multiple-output (MIMO) is regarded as a novel technique for Industrial Internet of Things (IIoT) networks, and many studies have addressed its cross-layer optimization, including random access and power allocation. Nevertheless, the combination of deep reinforcement learning (DRL) with cell-free massive MIMO has not been studied in depth. In this article, a primal–dual deep deterministic policy gradient (DDPG) algorithm is designed for cross-layer radio resource management, covering power allocation in the physical layer and random access in the medium access control (MAC) layer. Unlike existing studies, random access and power allocation are formulated jointly for cell-free massive MIMO IIoT networks using stochastic ergodic optimization. In contrast to the stochastic policy gradient algorithm, a primal–dual DDPG algorithm is designed for this cross-layer optimization. Moreover, a multiagent primal–dual DDPG algorithm is proposed for different scenarios in cell-free massive MIMO IIoT networks. Simulations are presented to verify the effectiveness of the primal–dual DDPG algorithm for random access and power allocation in cell-free massive MIMO IIoT networks.