Abstract

This paper investigates the multi-channel access (MCA) problem for vehicular networks in the presence of illegal operators and users. Because spectrum is limited, vehicle-to-vehicle (V2V) links need to reuse the frequency spectrum allocated to vehicle-to-infrastructure (V2I) links. In high-mobility vehicular environments, fast-varying channel conditions inevitably introduce significant uncertainty into the acquired channel state information (CSI). Hence, traditional centralized MCA methods, which rely on the CSI, cannot be performed in a timely manner. We formulate the MCA of the vehicle users as a distributed optimization problem that maximizes the sum capacity of V2I links while guaranteeing the reliability of V2V links and reducing the cost of frequency hopping. A Multi-Agent Deep Deterministic Policy Gradient (MADDPG) framework is introduced to tackle this problem, in which each vehicle user, connected through either a V2I or a V2V link, acts as a learning agent that makes spectrum access decisions rapidly and locally to meet the latency requirement. To adapt to the dynamic environment, we further propose a delayed-interaction-aided MADDPG algorithm, which enables online training for the distributed MCA problem. Moreover, to ensure that the proposed algorithm converges stably, we optimize the selection weights of experience traces and propose a Batch Prioritized Experience Replay (BPER) strategy so that high-priority traces are learned in a timely manner. Simulation results show that the proposed algorithm achieves a remarkable performance gain over the baselines.
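
As a rough illustration of what prioritizing experience traces means, the sketch below draws a training batch with probability proportional to each trace's priority. It shows generic proportional prioritized sampling only and is not the BPER strategy itself, whose weight optimization the abstract does not specify. The function name `sample_prioritized`, the exponent `alpha`, and the toy priorities are assumptions made for this example.

```python
import random


def sample_prioritized(priorities, batch_size, alpha=0.6, rng=None):
    """Sample trace indices with probability proportional to priority**alpha.

    Hypothetical helper: alpha = 0 gives uniform sampling, while larger
    values favor high-priority traces more strongly.
    """
    rng = rng or random.Random()
    # Turn raw priorities into sampling weights.
    weights = [p ** alpha for p in priorities]
    # random.choices samples with replacement according to the weights.
    return rng.choices(range(len(priorities)), weights=weights, k=batch_size)


# Toy replay buffer of four traces; the third has the largest priority,
# so it should be drawn most often over many samples.
toy_priorities = [0.1, 0.5, 2.0, 0.2]
batch = sample_prioritized(toy_priorities, batch_size=8, rng=random.Random(0))
```

In a replay-based learner, priorities are often derived from the magnitude of each trace's temporal-difference error, so that traces the critic predicts poorly are revisited sooner; whether the paper uses that measure is not stated in the abstract.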
