Abstract

This paper considers the routing problem in the communication process of an energy harvesting (EH) multi-hop cognitive radio network (CRN). The transmitter and the relays harvest energy from the environment and use it exclusively for transmitting data. Each relay on the path uses a limited data buffer to store received data before forwarding it. We consider a realistic scenario in which each EH node has only local causal knowledge, i.e., at any time, each EH node knows only its own EH process, channel state, and currently received data. We propose an EH routing algorithm for multi-hop CRNs based on Q-learning in reinforcement learning (RL), called EHR-QL. Our goal is to find an optimal routing policy that maximizes throughput and minimizes energy consumption. Through continuous intelligent selection under a partially observable Markov decision process (POMDP), we use Q-learning with linear function approximation to obtain the optimal path. Compared with basic Q-learning routing, EHR-QL is superior for longer distances and higher hop counts: it harvests more energy, consumes less energy, and yields predictable residual energy. In particular, we analyze the time complexity of EHR-QL and prove its convergence. In the simulation experiments, we first verify EHR-QL with six EH secondary user (EH-SU) nodes. Second, we evaluate the performance of EHR-QL (i.e., network lifetime, residual energy, and average throughput) under different learning rates $\alpha$ and discount factors $\gamma$. Finally, the experimental results show that EHR-QL achieves higher throughput, a longer network lifetime, lower latency, and lower energy consumption than basic Q-learning routing algorithms.
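To make the abstract's core mechanism concrete, the sketch below shows Q-learning with linear function approximation applied to next-hop selection in a toy six-node EH network. This is a minimal illustration, not the paper's exact EHR-QL formulation: the feature design, reward values, energy model, and network topology are all assumptions introduced here for demonstration.

```python
# Hypothetical sketch: Q-learning with linear function approximation for
# next-hop routing in an EH multi-hop network. The feature construction,
# reward, energy model, and topology are illustrative assumptions.
import random

import numpy as np

N_NODES = 6              # six EH-SU nodes, as in the paper's first experiment
SINK = 5                 # destination node (assumption)
ALPHA, GAMMA = 0.5, 0.8  # learning rate and discount factor (assumed values)
EPSILON = 0.1            # epsilon-greedy exploration rate (assumption)

# Toy connectivity: node i may forward to any node in NEIGHBORS[i] (assumption).
NEIGHBORS = {0: [1, 2], 1: [2, 3], 2: [3, 4], 3: [4, 5], 4: [5], 5: []}

def features(node, energy, action):
    """Feature vector phi(s, a); state = (current node, residual energy),
    action = chosen next hop."""
    phi = np.zeros(2 * N_NODES + 1)
    phi[node] = 1.0                  # one-hot current node
    phi[N_NODES + action] = 1.0      # one-hot next hop
    phi[-1] = energy                 # residual-energy feature in [0, 1]
    return phi

def q_value(w, node, energy, action):
    """Linear approximator: Q(s, a) = w . phi(s, a)."""
    return w @ features(node, energy, action)

def choose_action(w, node, energy):
    """Epsilon-greedy next-hop selection."""
    acts = NEIGHBORS[node]
    if random.random() < EPSILON:
        return random.choice(acts)
    return max(acts, key=lambda a: q_value(w, node, energy, a))

def step(node, action, energy):
    """Toy environment: transmitting costs energy, harvesting returns some;
    reaching the sink yields a positive reward (all values are assumptions)."""
    harvested = random.uniform(0.0, 0.05)
    energy = min(1.0, energy - 0.1 + harvested)
    reward = 10.0 if action == SINK else -0.1   # favor short, cheap paths
    done = action == SINK or energy <= 0.0
    return action, energy, reward, done

w = np.zeros(2 * N_NODES + 1)        # weights of the linear approximator
for episode in range(2000):
    node, energy = 0, 1.0            # source node with a full battery
    while node != SINK and energy > 0.0:
        action = choose_action(w, node, energy)
        nxt, energy, reward, done = step(node, action, energy)
        # TD target: r + gamma * max_a' Q(s', a') for non-terminal s'
        best_next = 0.0 if done or not NEIGHBORS[nxt] else max(
            q_value(w, nxt, energy, a) for a in NEIGHBORS[nxt])
        td_error = reward + GAMMA * best_next - q_value(w, node, energy, action)
        w += ALPHA * td_error * features(node, energy, action)
        node = nxt
        if done:
            break

# Greedy route extracted after training (illustrative only).
route, node, energy = [0], 0, 1.0
while node != SINK and NEIGHBORS[node]:
    node = max(NEIGHBORS[node], key=lambda a: q_value(w, node, energy, a))
    route.append(node)
print("learned route:", route)
```

The linear approximator is what lets the learned weights generalize across (node, energy) states instead of storing a separate Q-value per state, which is the practical motivation for function approximation over a basic tabular Q-learning router.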
