The growing demand for processing large volumes of data from IoT devices in hierarchical IoT networks has motivated a range of methods for efficient resource allocation at edge hosts. Traditional approaches typically face a trade-off: they either prioritize local decision-making at the edge, which lacks global system insight, or centralize decisions in the cloud, which raises privacy concerns. Additionally, most solutions do not consider task scheduling alongside resource allocation, which is needed to complete prioritized tasks effectively. This study introduces the hierarchical adaptive federated reinforcement learning (HAFedRL) framework for robust resource allocation and task scheduling in hierarchical IoT networks. At the local edge host level, a deep deterministic policy gradient (DDPG) method based on primal–dual updates is introduced for effective resource allocation and scheduling of individual tasks. Concurrently, the central server uses an adaptive multi-objective policy gradient (AMOPG) method, which integrates multi-objective policy adaptation (MOPA) with a dynamic federated reward aggregation (DFRA) method, to allocate resources across connected edge hosts. An adaptive learning rate modulation (ALRM) scheme is proposed to speed up convergence and maintain high performance. HAFedRL enables effective integration of rewards from edge hosts, ensuring that local and global optimization goals are aligned. Experimental results show that HAFedRL improves system-wide utility and average task completion rate and optimizes resource utilization, establishing it as a robust solution for hierarchical IoT networks.