The explosive growth of Internet-of-Things (IoT) applications such as smart cities and Industry 4.0 has led to a drastic increase in demand for wireless bandwidth, motivating the rapid development of new techniques for improving the spectrum utilization required by new-generation wireless communication technologies. Among these, dynamic spectrum access (DSA) is one of the most widely accepted approaches. In this paper, as an enhancement of existing work, we take inter-node collaboration into consideration in a dynamic spectrum environment. Typically, in such distributed settings, intelligent DSA relies almost invariably on self-learning to improve access performance. In contrast, this paper proposes a DSA scheme based on deep reinforcement learning to enhance spectrum and access efficiency. Unlike traditional Q-learning-based DSA, we introduce the following to enhance spectrum efficiency in dynamic IoT spectrum environments. First, deep double Q-learning is adopted to perform local spectrum self-learning at IoT terminals in order to achieve better dynamic access accuracy. Second, to accelerate learning convergence, federated learning (FL) at edge nodes is used to improve the self-learning. Third, multiple secondary users that do not interfere with each other and have similar operating conditions are clustered for federated learning to enhance the efficiency of deep reinforcement learning. Compared with traditional distributed DSA with deep learning, the proposed scheme converges faster owing to the global optimization characteristic of federated learning. Based on this, a framework of federated deep reinforcement learning (FDRL) for DSA is proposed. Furthermore, this scheme preserves the privacy of IoT users in that FDRL only requires model parameters to be uploaded to edge servers. Simulations are performed to show the effectiveness of the proposed FDRL-based DSA framework.
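The two building blocks named above, the double Q-learning target computed locally and the aggregation of model parameters at an edge server, can be illustrated with a minimal sketch. The abstract does not specify the aggregation rule or the network architecture, so the sketch assumes sample-weighted parameter averaging (in the style of FedAvg) and represents each model as a flat list of floats. The function names `double_q_target` and `federated_average` and all of their parameters are hypothetical, chosen only for illustration.

```python
from typing import List, Sequence


def double_q_target(
    reward: float,
    gamma: float,
    q_online_next: Sequence[float],
    q_target_next: Sequence[float],
    done: bool = False,
) -> float:
    """Double Q-learning target for one transition.

    The online network selects the next action (for example, which channel
    to access), and the target network evaluates that action. Decoupling
    selection from evaluation reduces the overestimation bias of plain
    Q-learning.
    """
    if done:
        return reward
    # Action selection uses the online network's Q-values.
    best_action = max(range(len(q_online_next)), key=lambda a: q_online_next[a])
    # Action evaluation uses the target network's Q-values.
    return reward + gamma * q_target_next[best_action]


def federated_average(
    client_weights: List[List[float]],
    client_sizes: List[int],
) -> List[float]:
    """Sample-weighted average of client model parameters (assumed rule).

    Only parameters are shared with the edge server; local spectrum
    observations never leave the terminals.
    """
    total = sum(client_sizes)
    n_params = len(client_weights[0])
    return [
        sum(w[i] * s for w, s in zip(client_weights, client_sizes)) / total
        for i in range(n_params)
    ]


# Usage: one local target, then aggregation over a cluster of two terminals.
target = double_q_target(1.0, 0.5, [0.2, 0.9, 0.1], [4.0, 2.0, 8.0])
global_weights = federated_average([[1.0, 2.0], [3.0, 4.0]], [1, 3])
```

In this sketch the online network prefers action 1, so the target uses the target network's value for action 1 rather than its own maximum. The aggregation step weights each terminal by a hypothetical local sample count.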