Abstract
This article investigates autonomous resource allocation in multi-UAV-enabled communication networks, with the goal of maximizing long-term rewards. To model the uncertainty of the environment, we formulate the long-term resource allocation problem as a stochastic game in which each UAV is a learning agent and each resource allocation solution corresponds to an action taken by the UAVs. We then propose a multi-agent reinforcement learning (MARL) framework in which each agent learns its best strategy from its local observations. More specifically, we propose an agent-independent method: all agents run the decision algorithm independently but share a common structure based on Q-learning. Finally, simulation results show that the proposed MARL algorithm performs acceptably relative to a benchmark in which the UAVs exchange complete information.
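The agent-independent idea can be illustrated with a minimal sketch of independent tabular Q-learning. This is not the authors' implementation: the abstract does not define the states, actions, or rewards, so the single-state setting, the channel-selection action space, the collision-based reward in `toy_rewards`, and all hyperparameter values below are assumptions chosen only for illustration.

```python
import random


class IndependentQAgent:
    """Tabular Q-learning agent that learns only from its own observations."""

    def __init__(self, n_states, n_actions, alpha=0.1, gamma=0.9,
                 epsilon=0.1, seed=0):
        self.n_actions = n_actions
        self.alpha = alpha        # learning rate (assumed value)
        self.gamma = gamma        # discount factor (assumed value)
        self.epsilon = epsilon    # exploration probability (assumed value)
        self.rng = random.Random(seed)
        self.q = [[0.0] * n_actions for _ in range(n_states)]

    def act(self, state):
        # Epsilon-greedy: explore with probability epsilon, else pick a best action.
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(self.n_actions)
        row = self.q[state]
        best = max(row)
        return self.rng.choice([a for a, v in enumerate(row) if v == best])

    def update(self, state, action, reward, next_state):
        # Standard Q-learning update toward the one-step bootstrapped target.
        target = reward + self.gamma * max(self.q[next_state])
        self.q[state][action] += self.alpha * (target - self.q[state][action])


def toy_rewards(actions):
    """Hypothetical reward: 1 if no other agent picked the same channel, else 0."""
    return [1.0 if actions.count(a) == 1 else 0.0 for a in actions]


def train(n_agents=2, n_channels=2, steps=2000, seed=0):
    # Every agent has the same structure but its own Q-table and random stream.
    agents = [IndependentQAgent(1, n_channels, seed=seed + i)
              for i in range(n_agents)]
    state = 0  # single-state toy environment (assumption)
    for _ in range(steps):
        actions = [agent.act(state) for agent in agents]
        rewards = toy_rewards(actions)
        # Each agent updates from its own action and reward; no Q-values are shared.
        for agent, action, reward in zip(agents, actions, rewards):
            agent.update(state, action, reward, state)
    return agents
```

Calling `train()` returns the list of trained agents, whose Q-tables can be inspected through `agent.q`. Each agent treats the other agents as part of the environment, which is what keeps information exchange between agents out of the learning loop in this sketch.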