Abstract

Flying ad hoc networks (FANETs), as an emerging communication paradigm, have been widely used in civil and military fields. Packet routing in FANETs is challenging due to dynamic network conditions, and traditional topology-based routing protocols are unsuitable for such rapidly changing topologies. Routing protocols based on reinforcement learning (RL) are a promising choice for FANETs because of their learning ability. However, existing RL-based routing protocols for FANETs have limited adaptability to network dynamics because they ignore neighborhood environment states, and they are prone to getting stuck in suboptimal routing policies owing to inappropriate reward design and delayed-reward issues. We propose AR-GAIL, an adaptive routing protocol based on Generative Adversarial Imitation Learning (GAIL), which selects the minimal end-to-end delay route according to ongoing network conditions in FANETs. We formulate the routing decision process as a Markov decision process (MDP) and design a novel MDP state that consists of the current node state and the neighborhood environment state. Moreover, we develop an efficient value function-based GAIL learning framework that learns the routing policy from expert routes instead of a predefined reward function. Simulation results show that AR-GAIL adapts well to network dynamics and outperforms state-of-the-art routing protocols in terms of end-to-end delay and packet delivery ratio.
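To make the MDP state described above concrete, the following is a minimal sketch of how a routing state combining the current node state and the neighborhood environment state might be represented, together with a greedy next-hop rule standing in for the learned policy. All names and features (queuing delay, distance to destination) are illustrative assumptions, not the paper's actual design.

```python
from dataclasses import dataclass
from typing import Dict

@dataclass(frozen=True)
class NodeState:
    # Hypothetical per-node features; the paper's actual state features may differ.
    queue_delay: float    # estimated queuing delay at the node (ms)
    dist_to_dest: float   # distance from the node to the destination (km)

@dataclass(frozen=True)
class RoutingState:
    # MDP state = current node state + neighborhood environment state.
    current: NodeState
    neighborhood: Dict[str, NodeState]  # neighbor id -> that neighbor's state

def greedy_next_hop(state: RoutingState) -> str:
    """Pick the neighbor with the lowest delay-oriented score.

    This simple heuristic stands in for the GAIL-learned policy,
    which would instead be trained from expert routes.
    """
    return min(
        state.neighborhood,
        key=lambda n: state.neighborhood[n].queue_delay
                      + state.neighborhood[n].dist_to_dest,
    )

# Example: neighbor "b" has the lower combined score, so it is chosen.
s = RoutingState(
    current=NodeState(queue_delay=0.3, dist_to_dest=3.0),
    neighborhood={
        "a": NodeState(queue_delay=0.5, dist_to_dest=2.0),
        "b": NodeState(queue_delay=0.1, dist_to_dest=1.0),
    },
)
chosen = greedy_next_hop(s)
```

In the actual protocol, the scoring function would be replaced by a policy network trained adversarially against a discriminator that distinguishes expert routes from generated ones.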
