The rapid progress in communication technologies and the widespread adoption of connected devices have given rise to various information-centric Internet-of-Things (IoT) systems, which typically require timely updates of information. Green IoT (G-IoT) strives to protect the environment by reducing the power consumption of billions of devices engaged in extensive data exchange, addressing the substantial energy demand in the process. The Age of Correlated Information (AoCI) measures the freshness of information shared across two or more devices that contribute to the same decision-making process. Optimizing AoCI using Deep Reinforcement Learning (DRL) in IoT reduces energy consumption, improves resource utilization, and promotes environmentally conscious communication, contributing to the development of a sustainable G-IoT system. This paper focuses on scheduling the transmission of status update packets among interconnected G-IoT devices to minimize the application-specific long-term average AoCI. The problem is modeled as an NP-hard episodic Markov Decision Process (MDP), highlighting its computational complexity. To handle correlations and the curse of dimensionality, a multi-agent deep reinforcement learning algorithm, specifically the Multi-Agent Deep Deterministic Policy Gradient (MADDPG) algorithm, is developed. The environment is designed to penalize time slots in which multiple agents transmit simultaneously or none transmits at all, promoting cooperative behavior and minimizing the average age of correlated information; training progress is tracked through episode rewards. We provide comprehensive simulation results, including reward convergence, the learning process of actors and critics, and the evolution of the average AoCI over training episodes, demonstrating the effectiveness of the MADDPG algorithm.
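The reward structure summarized above can be sketched as a shared per-slot reward: exactly one agent should transmit in each slot, and any collision (multiple transmitters) or idle slot (no transmitter) is penalized, while a valid schedule is rewarded by the negative average AoCI so that fresher correlated information yields a higher reward. This is a minimal illustrative sketch; the function name, the penalty value, and the exact reward shaping are assumptions, not taken from the paper.

```python
import numpy as np

def step_reward(actions, aoci, collision_penalty=-10.0):
    """Illustrative shared reward for the multi-agent scheduling environment.

    actions: binary vector, actions[i] = 1 if agent i transmits this slot.
    aoci:    current per-process Age of Correlated Information values.
    The penalty value and reward shaping are assumptions for illustration.
    """
    n_transmitting = int(np.sum(actions))
    if n_transmitting != 1:
        # Penalize collisions (several agents transmit) and idle slots
        # (no agent transmits) to encourage cooperative scheduling.
        return collision_penalty
    # Otherwise reward freshness: lower average AoCI -> higher reward.
    return -float(np.mean(aoci))
```

In a MADDPG setup, each agent's actor would map its local observation to a transmit decision, while centralized critics observe all agents' actions and this shared reward during training.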