Multi-granularity fusion resource allocation algorithm based on dual-attention deep reinforcement learning and lifelong learning architecture in heterogeneous IIoT

Ying Wang, Fengjun Shang, Jianjun Lei

doi:10.1016/j.inffus.2023.101871

Abstract

Deep reinforcement learning (DRL) is a promising technology for addressing the resource allocation problem for efficient data transmission in complex network environments. However, most DRL-based resource allocation algorithms suffer from limited feature extraction capabilities and lack scalability and generalization, especially in heterogeneous Industrial Internet of Things (IIoT) environments. In this paper, we develop a lifelong learning architecture that can integrate artificial intelligence (AI) algorithms into the heterogeneous IIoT network for efficient data transmission. Based on this architecture, we propose an intelligent resource allocation algorithm based on dual-attention DRL (DADR) for forwarding node selection and channel access slot allocation in a specific network environment. The proposed DADR algorithm combines the advantages of multi-dimensional convolutional attention and multi-head self-attention mechanisms. It provides local- and global-feature fusion capabilities for distributed nodes while maximizing data transmission performance. Furthermore, we present a lifelong federated meta reinforcement learning (LFMRL) algorithm that can effectively utilize prior knowledge and enable the DRL agent to quickly adapt to a new environment. Specifically, LFMRL adopts a federated meta learning-based knowledge fusion algorithm to fuse the knowledge of learned DADR algorithms and iteratively update the shared model, thereby improving the scalability and generalization of the shared model in heterogeneous IIoT environments. In addition, a simple and efficient knowledge transfer mechanism accelerates DRL model convergence by transferring the knowledge of the shared model to the new environment. Simulation results demonstrate the effectiveness of the proposed algorithms in terms of energy efficiency, data transmission reliability, and network stability. Compared with the DADR and FedAvg algorithms, the LFMRL algorithm further reduces energy consumption, training time, and the average number of forwarding node switches, while improving the packet delivery rate to 99.2%.
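
For orientation, the FedAvg baseline named above can be illustrated with a minimal Python sketch. It shows only plain weighted parameter averaging followed by copying the shared model to a new node. It is not the federated meta learning-based fusion used by LFMRL, whose update rule is not given in the abstract. The names `Params`, `fedavg`, and `transfer`, and the use of per-node weights such as local sample counts, are assumptions made for this illustration.

```python
from typing import Dict, List

# Hypothetical representation: each model is a dict of flat parameter lists.
Params = Dict[str, List[float]]


def fedavg(local_params: List[Params], weights: List[float]) -> Params:
    """Fuse local models by a weighted average of their parameters.

    `weights` is assumed to hold one non-negative value per node,
    for example the number of local training samples.
    """
    if not local_params or len(local_params) != len(weights):
        raise ValueError("need one weight per local model")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")

    fused: Params = {}
    for name in local_params[0]:
        size = len(local_params[0][name])
        fused[name] = [
            sum(w * p[name][i] for p, w in zip(local_params, weights)) / total
            for i in range(size)
        ]
    return fused


def transfer(shared: Params) -> Params:
    """Initialise a new node's model as an independent copy of the shared model."""
    return {name: list(values) for name, values in shared.items()}


if __name__ == "__main__":
    node_a = {"w": [1.0, 2.0]}
    node_b = {"w": [3.0, 6.0]}
    shared = fedavg([node_a, node_b], weights=[1.0, 3.0])
    new_node = transfer(shared)
    print(shared, new_node)
```

In this sketch a node with a larger weight pulls the shared parameters closer to its own, and the new node starts training from the shared parameters instead of from a random initialisation.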

Full Text