In the Internet of Things (IoT), the rapid growth in the number of terminals, together with rising demand for broadband services such as video streaming and virtual reality (VR) applications, has made spectrum efficiency a research focus. Dynamic spectrum access or sharing is commonly used to address this challenge. With the development of artificial intelligence (AI), a growing body of research has applied machine learning models, such as Q-learning, deep reinforcement learning and federated learning, to dynamic spectrum access. When federated learning is used to protect user privacy in dynamic spectrum access, several key challenges remain under-investigated, including the lack of personalized model selection and parameter updates and the difficulty of meeting the needs of different devices with a single global model. In this paper, the heterogeneity of IoT users is considered and a hierarchical and personalized federated deep reinforcement learning approach is proposed for dynamic spectrum access. First, each IoT user trains a local model with deep reinforcement learning to obtain optimal rewards and spectrum access rate. Second, because users have personalized characteristics, a client-edge-cloud federated learning framework is proposed to accelerate model convergence and reduce communication overhead between cloud servers and clients. Accordingly, user parameters are divided into personalized and basic parameters. Basic parameters are uploaded individually for global model training, while personalized parameters are retained locally to guarantee the applicability of the model. Lastly, the proposed framework is applied to dynamic spectrum access, improving the efficiency of spectrum access and reducing the bandwidth occupied by exploiting the global optimization capabilities of federated learning. Because only basic parameters are uploaded, user privacy is substantially protected.
Simulation results show that the proposed framework can improve user communication performance and the accuracy of spectrum access. The average convergence time for IoT users can be reduced by about 40% compared with conventional federated-learning-based dynamic spectrum access.
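
The split between basic and personalized parameters in a client-edge-cloud hierarchy can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify which parameters are treated as personalized or how aggregation is weighted. The sketch assumes that parameters are plain lists of floats keyed by name, that a hypothetical set `personalized_keys` marks the parameters kept on the device, and that edge models are averaged at the cloud in proportion to their client counts (a FedAvg-style assumption).

```python
from typing import Dict, List, Set

Params = Dict[str, List[float]]  # parameter name -> flat list of values


def split_params(params: Params, personalized_keys: Set[str]):
    """Separate a client's parameters into (basic, personalized) parts."""
    basic = {k: v for k, v in params.items() if k not in personalized_keys}
    personal = {k: v for k, v in params.items() if k in personalized_keys}
    return basic, personal


def weighted_average(models: List[Params], weights: List[float]) -> Params:
    """Element-wise weighted mean of models that share the same keys and shapes."""
    total = sum(weights)
    return {
        key: [
            sum(w * m[key][i] for m, w in zip(models, weights)) / total
            for i in range(len(models[0][key]))
        ]
        for key in models[0]
    }


def hierarchical_round(edges: List[List[Params]],
                       personalized_keys: Set[str]) -> List[List[Params]]:
    """One aggregation round: clients -> edge servers -> cloud -> clients.

    Only basic parameters leave a client. Each edge averages the basic
    parameters of its clients, the cloud averages the edge models weighted
    by client count (an assumption), and every client then overwrites its
    basic parameters with the global ones while keeping its personalized
    parameters untouched.
    """
    edge_models, edge_sizes = [], []
    for clients in edges:
        basics = [split_params(c, personalized_keys)[0] for c in clients]
        edge_models.append(weighted_average(basics, [1.0] * len(basics)))
        edge_sizes.append(float(len(clients)))

    global_basic = weighted_average(edge_models, edge_sizes)

    # Merge: global basic parameters replace local ones; personalized stay.
    return [[{**c, **global_basic} for c in clients] for clients in edges]
```

In this sketch, calling `hierarchical_round` with two edge groups and `personalized_keys={"head"}` (a hypothetical parameter name) would leave every client's `"head"` entry unchanged while replacing all other entries with the cloud-level average. In a deep reinforcement learning setting, the personalized part might be, for example, the final layers of each user's policy network, though the abstract does not say so.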