Reinforcement learning-based dynamic band and channel selection in cognitive radio ad-hoc networks

Sung-Jeen Jang, Chul-Hee Han, Sang-Jo Yoo, Kwang-Eog Lee

doi:10.1186/s13638-019-1433-1

Abstract

In a cognitive radio (CR) ad-hoc network, the characteristics of the frequency resources vary with time and geographic location, and they need to be considered in order to use those resources efficiently. Environmental statistics, such as the available transmission opportunity and data rate of each channel, and the system requirements, specifically the desired data rate, can also change with time and location. In multi-band operation, the primary users' time activity characteristics and the usable frequency bandwidth differ from band to band. In this paper, we propose a Q-learning-based dynamic band and channel selection method that considers the surrounding wireless environment and the system demands in order to maximize the available transmission time and capacity at a given time and geographic area. Experiments confirm that the system dynamically chooses a band and channel suitable for the required data rate and operates according to the desired system performance.

Highlights

As the demand for multimedia services increases, the frequency shortage problem continues to worsen
Cognitive radio (CR) technologies give secondary users (SUs) an opportunity to use spectrum that is not used by primary users (PUs), allowing the SUs to access the spectrum by adjusting their operational parameters [4, 5]
In this paper, we propose a band group and channel selection method that considers the consecutive channel operation time, data transmission rate, channel utilization efficiency, and cost of a band group change for a cognitive radio ad-hoc network composed of a cluster head (CH) and member nodes (MNs)

Summary

Introduction

As the demand for multimedia services increases, the frequency shortage problem continues to worsen. We propose an optimal band and channel selection mechanism for the cognitive radio ad-hoc network using reinforcement learning. Using the proposed Q-learning, the CH can select the band and channel that maximize the multi-objective function of the CR network, which can also increase the coexistence efficiency of the overall secondary systems. The main contributions are:
● We propose a new CR system architecture that maximizes the secondary users' service quality by dynamically selecting the optimal operating band and channel, taking into account the traffic demand of each CR system and the channel statistics determined by the primary systems.
● We design a reward function that maximizes operating time, data rate, and channel utilization efficiency and minimizes band change overhead for secondary systems.
● We define a reward using weighted sums for various budgets as well as a Q-learning algorithm that can operate according to the change in weights.
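The weighted-sum reward and Q-learning selection described in this section can be illustrated with a minimal sketch. This is not the paper's implementation: the weight values, the normalization of each term to [0, 1], the state and action encoding as (band, channel) pairs, and the function names `reward`, `q_update`, and `select_action` are all assumptions made for illustration. The update itself is the standard tabular Q-learning rule with epsilon-greedy selection.

```python
import random

# Hypothetical weights for the four reward terms named in the text.
# Changing these shifts the preference between the objectives.
WEIGHTS = {
    "operating_time": 0.4,
    "data_rate": 0.3,
    "utilization": 0.2,
    "band_change": 0.1,
}


def reward(operating_time, data_rate, utilization, band_changed, weights=WEIGHTS):
    """Weighted-sum reward (assumed form).

    The first three inputs are assumed to be normalized to [0, 1].
    A band change is charged as a fixed penalty.
    """
    penalty = 1.0 if band_changed else 0.0
    return (
        weights["operating_time"] * operating_time
        + weights["data_rate"] * data_rate
        + weights["utilization"] * utilization
        - weights["band_change"] * penalty
    )


def q_update(q, state, action, r, next_state, actions, alpha=0.1, gamma=0.9):
    """Standard tabular Q-learning update; unseen entries default to 0."""
    best_next = max(q.get((next_state, a), 0.0) for a in actions)
    old = q.get((state, action), 0.0)
    q[(state, action)] = old + alpha * (r + gamma * best_next - old)
    return q[(state, action)]


def select_action(q, state, actions, epsilon=0.1, rng=random):
    """Epsilon-greedy choice over (band, channel) actions."""
    if rng.random() < epsilon:
        return rng.choice(actions)
    return max(actions, key=lambda a: q.get((state, a), 0.0))


# Usage: two bands with two channels each (illustrative values only).
actions = [(band, ch) for band in ("band_A", "band_B") for ch in (0, 1)]
q_table = {}
state = ("band_A", 0)
action = ("band_B", 1)
r = reward(operating_time=1.0, data_rate=1.0, utilization=1.0,
           band_changed=(action[0] != state[0]))
q_update(q_table, state, action, r, next_state=action, actions=actions)
greedy = select_action(q_table, state, actions, epsilon=0.0)
```

In this sketch the CH would call `select_action` at each decision epoch, observe the resulting operating time, data rate, and utilization, and then call `q_update`. Because the weights enter only through `reward`, they can be changed at run time without touching the learning rule.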

Network model

Reinforcement learning for dynamic band and channel selection

Conclusions