Abstract

Vehicular ad hoc networks (VANETs) hold great importance because of their potential to improve road safety, enable traffic monitoring, and provide in-vehicle infotainment services. Owing to high mobility, sparse connectivity, roadside obstacles, and a shortage of roadside units, the links between vehicles are subject to frequent disconnections; consequently, routing is crucial. Recently, to achieve more efficient routing, reinforcement learning (RL)-based routing algorithms have been investigated. RL is a class of artificial intelligence that implements a learning procedure based on previous experience and provides better solutions for future operations. RL algorithms are more favorable than other optimization techniques owing to their modest usage of memory and computational resources. Because VANETs deal with passenger safety, flaws in VANET routing are intolerable. Fortunately, RL-based algorithms have the potential to optimize the quality-of-service parameters of VANET routing, such as bandwidth, end-to-end delay, throughput, control overhead, and packet delivery ratio. However, to the best of the authors' knowledge, no survey of RL-based routing protocols for VANETs has been conducted. To fill this gap in the literature and to provide future research directions, it is necessary to aggregate the scattered works on this topic. This study presents a comparative investigation of RL-based routing protocols, considering their working procedures, advantages, disadvantages, and applications. The protocols are qualitatively compared in terms of key features, characteristics, optimization criteria, performance evaluation techniques, and implemented RL techniques. Lastly, open issues and research challenges are discussed to make RL-based VANET routing protocols more efficient in the future.

Highlights

  • Vehicular ad hoc networks (VANETs) are among the most investigated topics in the field of mobile ad hoc networks (MANETs)

  • Reinforcement learning (RL) is the only branch of machine learning (ML) in which the efficiency of a system continues to improve over time

  • The most significant difference between RL algorithms and other AI algorithms is that RL algorithms can continuously improve their performance as they gain experience, whereas other paradigms are limited by the information they are given

Summary

INTRODUCTION

Vehicular ad hoc networks (VANETs) are among the most investigated topics in the field of mobile ad hoc networks (MANETs). Machine learning (ML) algorithms are used to improve the performance of routing protocols for VANETs [17]. As no other survey has been conducted on the topic of "RL-based VANET routing algorithms," we have not restricted the publication dates of the reviewed works. A critical analysis of the RL-based VANET routing protocols is presented, emphasizing their working procedures, advantages, disadvantages, and applications. Some major RL techniques include trust region policy optimization (TRPO) [41], proximal policy optimization (PPO) [42], Q-learning or the value iteration method [43], state-action-reward-state-action (SARSA) [44], and the deep Q network (DQN) [45] algorithm. Four main variants of RL algorithms are used in the investigated RL-based VANET routing protocols: the Q-learning algorithm, policy hill climbing, SARSA(λ), and the deep RL (DRL) algorithm. Further discussion is given in the following subsections, describing the working methodologies of these algorithms; a brief illustrative sketch of the core Q-learning update is given below.
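To make the Q-learning (value iteration) technique concrete, the following is a minimal, self-contained sketch of an epsilon-greedy tabular Q-learning agent. It is illustrative only and does not implement any specific protocol from the survey: the state/action abstraction, the `ALPHA`, `GAMMA`, and `EPSILON` values, and the helper names `choose_action` and `update` are assumptions for this example. In an RL-based VANET routing protocol, a state would typically identify the current packet carrier and an action a candidate next-hop neighbor, with the reward derived from metrics such as delivery success or delay.

```python
import random
from collections import defaultdict

# Illustrative sketch only; not a specific VANET protocol from the survey.
ALPHA = 0.1    # learning rate (assumed value)
GAMMA = 0.9    # discount factor (assumed value)
EPSILON = 0.1  # exploration probability (assumed value)

Q = defaultdict(float)  # maps (state, action) -> estimated value, default 0.0

def choose_action(state, actions):
    """Epsilon-greedy selection over candidate actions (e.g., next hops)."""
    if random.random() < EPSILON:
        return random.choice(actions)
    return max(actions, key=lambda a: Q[(state, a)])

def update(state, action, reward, next_state, next_actions):
    """Standard Q-learning update:
    Q(s,a) <- Q(s,a) + alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))."""
    best_next = max((Q[(next_state, a)] for a in next_actions), default=0.0)
    Q[(state, action)] += ALPHA * (reward + GAMMA * best_next - Q[(state, action)])

# Example: one learning step along a hypothetical two-hop route.
s, actions = "vehicle_A", ["vehicle_B", "vehicle_C"]
a = choose_action(s, actions)
update(s, a, reward=1.0, next_state=a, next_actions=["destination"])
```

Because the learned values live in a simple lookup table, this variant has the modest memory and computation footprint noted in the abstract; the DRL variants discussed in the survey replace the table with a neural network approximator.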

Q-LEARNING
COMPARISON
RECOMMENDATIONS
OPEN RESEARCH ISSUES AND CHALLENGES
CONCLUSION