Real-time data delivery is critical for the Industrial Internet of Things (IIoT). Age of information (AoI), a popular real-time metric, is commonly used to measure the data freshness of IIoT systems. If the data most recently received by the destination at time $t$ was generated at time $t_{1}$, then the AoI is $t-t_{1}$. In this paper, we consider a multi-sensor multi-server IIoT system and develop scheduling algorithms to minimize the average AoI. The challenge lies in the strong coupling between link scheduling, server selection, and service preemption. To address this issue, we propose a guided exploration-based deep Q-Network (GE-DQN) algorithm that uses a fixed advantage policy and learns faster than the classical deep Q-Network. Moreover, we restructure GE-DQN as a shared decision module followed by several network branches and propose a guided exploration-based Branching Dueling Q-Network (GE-BDQN) algorithm. Because the branch structure of GE-BDQN decomposes the high-dimensional action, the number of output neurons grows linearly with the number of sensors, rather than approximately exponentially as in GE-DQN, which keeps the algorithm applicable to large-scale systems. Simulation results show that both proposed algorithms achieve a lower average AoI than the advanced algorithms they are compared against, and that GE-BDQN achieves up to a 36% performance gain.
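As an illustration of the two quantitative ideas in the abstract, the following minimal Python sketch computes a slotted AoI trace from the definition $t-t_{1}$ and compares output-layer sizes for a joint versus a branched action head. It is not the paper's implementation. The slotted time model, the function names, and the assumption that each sensor independently picks one of `choices_per_sensor` options are all hypothetical simplifications introduced here; the paper's actual action space may be constrained differently.

```python
from typing import Dict, List


def aoi_trace(horizon: int, deliveries: Dict[int, int],
              initial_generation_time: int = 0) -> List[int]:
    """AoI at slots t = 0..horizon-1, using AoI(t) = t - t1.

    deliveries maps a reception slot to the generation time t1 of the
    packet received in that slot (hypothetical input format).
    """
    latest_generation = initial_generation_time
    trace = []
    for t in range(horizon):
        if t in deliveries:
            # The most recently received packet defines t1.
            latest_generation = deliveries[t]
        trace.append(t - latest_generation)
    return trace


def average_aoi(trace: List[int]) -> float:
    """Time-average AoI over the observed slots."""
    return sum(trace) / len(trace)


def joint_output_count(num_sensors: int, choices_per_sensor: int) -> int:
    # Assumption: one Q-value per joint action, so the output layer
    # grows exponentially in the number of sensors.
    return choices_per_sensor ** num_sensors


def branching_output_count(num_sensors: int, choices_per_sensor: int) -> int:
    # Assumption: one branch per sensor, each with its own small output
    # head, so the output layer grows linearly in the number of sensors.
    return num_sensors * choices_per_sensor


if __name__ == "__main__":
    trace = aoi_trace(6, {2: 1, 4: 4})
    print(trace, average_aoi(trace))
    for n in (2, 5, 10):
        print(n, joint_output_count(n, 4), branching_output_count(n, 4))
```

In this toy setting, a packet generated at time 1 and received at slot 2 resets the AoI to 1, after which it grows by one per slot until the next delivery. With 10 sensors and 4 choices each, the joint head would need 4^10 outputs while the branched head needs 40, which is the scaling contrast the abstract describes.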