Mean-Field-Aided Multiagent Reinforcement Learning for Resource Allocation in Vehicular Networks

Hengxi Zhang, Xiaoli Wei, Le Liang, Wenbo Ding, Chengyue Lu, Huaze Tang, Zhu Han, Ling Cheng

doi:10.1109/jiot.2022.3214525

Abstract

As an enabling technique for autonomous driving, vehicular networks can achieve high efficiency through vehicle-infrastructure cooperation, improving safety and supporting many value-added services. To raise communication efficiency further, much effort has been devoted to resource allocation in vehicular networks. Nevertheless, owing to its strong nonconvexity and nonlinearity, the classical joint resource allocation problem in vehicular networks is typically NP-hard. Multiagent reinforcement learning (MARL) has emerged as a promising solution to this challenge, but its stability and scalability are unsatisfactory as the number of vehicles increases. In this article, we investigate joint spectrum and power allocation in vehicular communication networks and carefully model the interactions between the vehicles and the environment by incorporating cooperative stochastic game theory into MARL. The resulting method, named complete-game MARL (CG-MARL), achieves better convergence and stability, with a theoretical computational complexity of $\mathcal{O}(n^{N})$, where $n$ denotes the dimension of the action space and $N$ denotes the number of V2X vehicles.
Furthermore, mean-field game (MFG) theory is employed to enhance MARL, reducing the heavy computational cost of CG-MARL to $\mathcal{O}(n^{2})$ while maintaining comparable performance. Simulation results demonstrate that the proposed mean-field-aided MARL (MF-MARL) for vehicular network resource allocation achieves 95% near-optimal performance with much lower complexity, which indicates its significant potential in scenarios with large numbers of densely deployed vehicles.
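
To make the gap between the two stated complexity orders concrete, the following minimal Python sketch tabulates $n^{N}$ against $n^{2}$. It is plain arithmetic on the two expressions, not the authors' algorithm. The function names and the sample values of $n$ and $N$ are hypothetical, and reading $n^{2}$ as "own action paired with one aggregate neighbor action" is a simplifying assumption made only for illustration.

```python
def joint_action_count(n: int, N: int) -> int:
    """Joint actions when N agents with n actions each are modeled together: n**N."""
    return n ** N


def mean_field_action_count(n: int) -> int:
    """Action pairs under the assumed mean-field view (own action x aggregate action): n**2."""
    return n ** 2


if __name__ == "__main__":
    n = 4  # hypothetical action-space dimension, not taken from the paper
    for N in (2, 4, 8, 16):  # hypothetical numbers of vehicles
        print(f"N={N:2d}  n^N={joint_action_count(n, N)}  n^2={mean_field_action_count(n)}")
```

In this sketch the first count grows exponentially with the number of vehicles while the second does not depend on $N$, which is the scaling behavior the abstract attributes to the mean-field approximation.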

Full Text