Abstract

This paper addresses the problem of distributed dynamic spectrum access in a cognitive radio (CR) environment using deep recurrent reinforcement learning. Specifically, the network consists of multiple primary users (PUs) transmitting intermittently on their respective channels, while secondary users (SUs) attempt to access the channels when the PUs are not transmitting. The problem is challenging given the decentralized nature of the CR network: each SU attempts to access a vacant channel without coordinating with other SUs, which results in collisions and throughput loss. To address this issue, a multi-agent setting is considered in which each SU performs independent reinforcement learning to learn a policy for transmitting opportunistically so as to minimize collisions with other users. In this article, we propose two long short-term memory (LSTM) based deep recurrent Q-network (DRQN) architectures that exploit the temporal correlation in the transmissions of the various nodes in the network. Furthermore, we investigate the effect of the architecture on the success rate under a varying number of users and partial channel observations. Simulation results are compared with existing reinforcement learning based techniques to demonstrate the superiority of the proposed method.
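To make the abstract's setup concrete, the following is a minimal sketch of the kind of per-SU agent it describes: an LSTM-based DRQN that maps a history of (possibly partial) channel observations to Q-values over channel-access actions. The paper's actual architecture and hyperparameters are not given here; the observation size, action set, and hidden width below are illustrative placeholders only.

```python
import torch
import torch.nn as nn

class DRQN(nn.Module):
    """Minimal LSTM-based deep recurrent Q-network for one SU agent.

    Each time step's input is the SU's (possibly partial) sensing result
    for the channels; the outputs are Q-values over the access actions.
    Sizes are assumptions for illustration, not the paper's settings.
    """
    def __init__(self, obs_dim, num_actions, hidden_dim=64):
        super().__init__()
        self.lstm = nn.LSTM(obs_dim, hidden_dim, batch_first=True)
        self.q_head = nn.Linear(hidden_dim, num_actions)

    def forward(self, obs_seq, hidden=None):
        # obs_seq: (batch, seq_len, obs_dim) history of channel observations
        out, hidden = self.lstm(obs_seq, hidden)
        q_values = self.q_head(out)  # Q-values per action at every step
        return q_values, hidden

# Example: 5 channels sensed; actions = {transmit on ch 1..5, stay idle}
net = DRQN(obs_dim=5, num_actions=6)
obs = torch.zeros(1, 10, 5)           # one agent, 10-step observation history
q, h = net(obs)
action = q[:, -1].argmax(dim=-1)      # greedy action from the latest step
```

The LSTM state is what lets each independent learner summarize the temporal correlation in PU and SU transmissions that the abstract highlights, which a feedforward DQN acting on a single observation could not capture under partial observability.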
