Abstract
To achieve autonomous anti‐jamming routing for wireless mesh networks (WMNs) under the threat of unknown removable jammers, this paper proposes a Q‐learning anti‐jamming routing algorithm based on an improved upper‐confidence‐bound (UCB) algorithm. Specifically, the WMN anti‐jamming route‐finding problem is first modelled as a Markov decision process. In traditional Q‐learning, the preset parameters alone determine the length of the exploration process, which slows the convergence of the anti‐jamming strategy. To address this drawback, a discounted‐UCB‐tuned algorithm (Dis‐UCB1‐tuned) is adopted as the decision algorithm, replacing the ε‐greedy or softmax algorithm used in traditional Q‐learning. Finally, simulation results demonstrate the superiority of the proposed anti‐jamming scheme over the UCB‐based Q‐learning (UCBQ) algorithm, the Dyna‐Q algorithm, and the traditional Q‐learning algorithm: the proposed scheme converges quickly to a superior anti‐jamming route without any prior knowledge of the jamming environment, and its performance is independent of the preset parameters.
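To make the idea of swapping ε‐greedy for a UCB‐style decision rule concrete, the following is a minimal Python sketch, not the authors' implementation. The abstract does not give the Dis‐UCB1‐tuned formula, so the sketch assumes one plausible combination: exponentially discounted visit statistics (so that old observations fade when a jammer changes the environment) together with the standard UCB1‐tuned variance‐aware bonus. Rewards are assumed to be normalised to [0, 1]; the names `DiscountedUCBTuned` and `q_update`, and the default values of `discount`, `alpha`, and `gamma`, are hypothetical.

```python
import math
from collections import defaultdict


class DiscountedUCBTuned:
    """Per-state action selector (illustrative sketch, assumed formulation)."""

    def __init__(self, n_actions, discount=0.99):
        self.n_actions = n_actions
        self.discount = discount  # forgetting factor for old observations (assumed)
        # Discounted visit counts, reward sums, and squared-reward sums per state.
        self.count = defaultdict(lambda: [0.0] * n_actions)
        self.total = defaultdict(lambda: [0.0] * n_actions)
        self.total_sq = defaultdict(lambda: [0.0] * n_actions)

    def select(self, state, q_row):
        count = self.count[state]
        # Try every action once before relying on the confidence bounds.
        for a in range(self.n_actions):
            if count[a] == 0.0:
                return a
        log_n = max(math.log(sum(count)), 0.0)
        best_action, best_score = 0, -math.inf
        for a in range(self.n_actions):
            mean = self.total[state][a] / count[a]
            var = max(self.total_sq[state][a] / count[a] - mean * mean, 0.0)
            # UCB1-tuned: cap the variance estimate at 1/4 (rewards in [0, 1]).
            v = var + math.sqrt(2.0 * log_n / count[a])
            bonus = math.sqrt(log_n / count[a] * min(0.25, v))
            score = q_row[a] + bonus
            if score > best_score:
                best_action, best_score = a, score
        return best_action

    def update(self, state, action, reward):
        # Decay all statistics for this state, then record the new observation.
        for stats in (self.count[state], self.total[state], self.total_sq[state]):
            for a in range(self.n_actions):
                stats[a] *= self.discount
        self.count[state][action] += 1.0
        self.total[state][action] += reward
        self.total_sq[state][action] += reward * reward


def q_update(Q, state, action, reward, next_state, alpha=0.5, gamma=0.9):
    """Standard one-step Q-learning update on a dict of per-state value lists."""
    target = reward + gamma * max(Q[next_state])
    Q[state][action] += alpha * (target - Q[state][action])
```

In this sketch the exploration bonus shrinks as an action's discounted count grows and recovers as that count decays, so the amount of exploration is driven by the observed statistics rather than by a fixed ε or temperature schedule. In a routing setting, `state` would be the current node and each action a candidate next hop; that mapping is likewise an assumption here.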