Abstract
Intelligent resource allocation and power control are regarded as important means of alleviating the problems caused by the sharp increase in the number of users and in operating costs. In this paper, we propose a multi-agent deep reinforcement learning (MADRL)-based algorithm that jointly optimizes resource block (RB) allocation and power control, with the aim of maximizing the average spectrum efficiency (SE) of the system while meeting quality of service (QoS) constraints. We adopt MADRL because its centralized training with distributed execution paradigm retains the advantages of centralized training while reducing computation and signaling overhead. In the proposed MADRL model, the Q functions of the individual agents are aggregated through a value decomposition network, which strengthens cooperation among agents and improves the convergence of the algorithm. We also add a reward discount network to the original MADRL framework to adaptively adjust the weight placed on future rewards according to the agents' performance during training. Simulation experiments show that the proposed algorithm achieves better performance and stability than existing alternatives.
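The aggregation step can be illustrated with a minimal sketch, assuming the standard additive form of a value decomposition network, in which the joint value is the sum of the per-agent Q values. The abstract does not specify the network architecture, so the function names, the toy Q values, and the interpretation of an action as an RB and power-level pair are hypothetical.

```python
# Illustrative sketch only: assumes additive VDN-style aggregation.
# All names and numbers below are hypothetical, not taken from the paper.
from typing import List, Sequence


def greedy_actions(per_agent_q: Sequence[Sequence[float]]) -> List[int]:
    """Distributed execution: each agent picks the action with its own highest Q value."""
    return [max(range(len(q)), key=lambda a: q[a]) for q in per_agent_q]


def joint_q(per_agent_q: Sequence[Sequence[float]], actions: Sequence[int]) -> float:
    """Additive aggregation: the joint value is the sum of the chosen per-agent Q values."""
    return sum(q[a] for q, a in zip(per_agent_q, actions))


# Toy example: 3 agents, each with 4 discrete actions
# (for instance, an index over RB / power-level pairs).
q_values = [
    [0.1, 0.7, 0.3, 0.2],
    [0.5, 0.4, 0.9, 0.0],
    [0.2, 0.2, 0.1, 0.6],
]

actions = greedy_actions(q_values)   # each agent acts on its own Q values
q_tot = joint_q(q_values, actions)   # value used for the shared training signal
```

Because the joint value is a plain sum, each agent maximizing its own Q value also maximizes the joint value, which is what allows the agents to act independently at execution time while being trained on a shared objective.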