Abstract

Packet routing is a fundamental problem in wireless networks: routers must decide the next hop for each packet so that it reaches its destination as quickly as possible. To overcome the non-optimal forwarding paths caused by the fixed forwarding modes of geographic-location-based routing algorithms, we investigate a new, efficient packet routing strategy that uses a Deep Reinforcement Learning (DRL) algorithm, Proximal Policy Optimization (PPO), to minimize hop count and reduce the probability of encountering "routing holes" while forwarding packets in complex networks. Each node in the network makes its own routing decisions, learning a policy that retains the efficiency of greedy forwarding while reducing reliance on perimeter forwarding. The learning process of the DRL agent is based on the status of the packet being transmitted and on information about neighbor nodes within communication range. Extensive simulations over dynamic network topologies and varying numbers of nodes show that our routing agent learns a policy that outperforms the Greedy Perimeter Stateless Routing (GPSR) protocol in terms of average packet delivery rate and hop count. The performance of PPO in a large-scale action space is also verified, providing a basis for future research combining PPO with packet routing optimization.
