Abstract

In this paper we propose a novel Bayesian network-based model for analysing the convergence properties of reinforcement learning (RL) based dynamic spectrum access (DSA) algorithms. The model uses a minimum-complexity DSA problem for probabilistic analysis of the joint policy transitions of RL algorithms. A Monte Carlo simulation of a distributed Q-learning DSA algorithm shows that the proposed approach exhibits remarkable accuracy in predicting the convergence behaviour of such algorithms. Furthermore, this behaviour can also be expressed in the form of an absorbing Markov chain, derived from the proposed Bayesian network model. This representation enables further theoretical analysis of the convergence properties of RL-based DSA algorithms. The main benefit of the analysis tool presented in this paper is that it enables the design and theoretical evaluation of novel DSA schemes by extending the proposed Bayesian network model.
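To make the setting concrete, below is a minimal Python sketch of the kind of Monte Carlo experiment the abstract describes: independent agents running stateless Q-learning over a small set of channels, with convergence identified as the joint greedy policy becoming collision-free. This is an illustrative sketch only; all names and parameter values (run_episode, num_agents, num_channels, alpha, epsilon, and so on) are assumptions, not the paper's actual model.

```python
# Minimal sketch (not the paper's actual model): one Monte Carlo run of a
# distributed, stateless Q-learning DSA game. All parameters below are
# illustrative assumptions.
import random

def run_episode(num_agents=2, num_channels=2, alpha=0.1,
                epsilon=0.1, max_steps=10_000, seed=None):
    """Return the step at which the joint greedy policy becomes
    collision-free (an 'absorbing' state), or None if it does not
    converge within max_steps."""
    rng = random.Random(seed)
    # One Q-value per (agent, channel); no environment state is tracked.
    q = [[0.0] * num_channels for _ in range(num_agents)]
    for step in range(max_steps):
        # Epsilon-greedy channel selection for every agent.
        actions = []
        for agent in range(num_agents):
            if rng.random() < epsilon:
                actions.append(rng.randrange(num_channels))
            else:
                row = q[agent]
                actions.append(max(range(num_channels), key=row.__getitem__))
        # Reward 1 for an uncontested channel, 0 on collision.
        for agent, ch in enumerate(actions):
            reward = 1.0 if actions.count(ch) == 1 else 0.0
            q[agent][ch] += alpha * (reward - q[agent][ch])
        # Converged once the greedy joint policy is collision-free.
        greedy = [max(range(num_channels), key=q[a].__getitem__)
                  for a in range(num_agents)]
        if len(set(greedy)) == num_agents:
            return step
    return None

# Monte Carlo estimate of the convergence-time distribution.
results = [run_episode(seed=s) for s in range(1000)]
converged = [r for r in results if r is not None]
print(f"converged: {len(converged)/len(results):.1%}, "
      f"mean steps: {sum(converged)/max(len(converged), 1):.1f}")
```

In the absorbing Markov chain view mentioned in the abstract, each collision-free joint policy corresponds to an absorbing state of the chain over joint policies; standard absorbing-chain theory then gives expected convergence times via the fundamental matrix N = (I − Q)⁻¹, where Q here denotes the transient block of the chain's transition matrix (distinct from the Q-tables above).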
