With the rapid increase in the number of Internet-of-Things (IoT) devices, the massive volume of generated data creates significant challenges for traditional cloud-based solutions. These solutions often lead to high latency, increased operational costs, and limited scalability, making them unsuitable for real-time applications and resource-constrained environments. As a result, edge and fog computing have emerged as viable alternatives, reducing latency and costs by processing data closer to its source. However, managing the flow of such vast and distributed data streams requires well-structured data pipelines to control the complete lifecycle, from data acquisition at the source to processing at the edge and fog layers, and finally storage and analytics in the cloud. Handling data analytics dynamically at varying distances from the source, often on heterogeneous hardware devices, calls for collaborative learning techniques such as Federated Learning (FL). FL enables decentralized model training by leveraging the local data on Edge Devices (EDs), thereby preserving data privacy and reducing communication overhead with the cloud. However, FL faces critical challenges, including data heterogeneity, where the non-independent and identically distributed (non-IID) nature of data degrades model performance, and resource limitations on EDs, which lead to inefficiencies in training and biases in the aggregated models. To address these issues, we propose Pop-Up Federated Learning (PopFL), a novel FL solution for edge networks. PopFL introduces hierarchical aggregation to reduce network congestion by distributing aggregation tasks across multiple Fog Servers (FSs) rather than relying solely on centralized cloud aggregation. To further enhance participation and resource utilization at the edge, we incorporate a Stackelberg game model that incentivizes EDs based on their contribution and resource availability. Additionally, PopFL employs a pop-up ad-hoc network for scalable and efficient communication between EDs and FSs, ensuring robust data transmission under dynamic network conditions. Extensive experiments conducted on three diverse datasets highlight the superior performance of PopFL compared to state-of-the-art FL techniques. The results show significant improvements in model accuracy, robustness, and fairness across various scenarios, effectively addressing the challenges of data heterogeneity and resource limitations. Through these innovations, PopFL paves the way for more reliable and efficient distributed learning systems, unlocking the full potential of FL in real-world applications where low latency and scalable solutions are critical.
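
To make the hierarchical-aggregation idea concrete, the following is a minimal sketch of a two-tier, FedAvg-style round: EDs train locally, each FS aggregates only its own devices' updates, and the cloud then averages the far smaller set of fog-level models. The abstract does not specify PopFL's actual aggregation rule or model, so the toy logistic-regression model, the sample-count weighting, and all function names (local_update, weighted_average) are illustrative assumptions, not PopFL's API.

```python
import numpy as np

def local_update(global_weights, data, labels, lr=0.1, epochs=5):
    """One ED's local training: toy logistic-regression, gradient descent."""
    w = global_weights.copy()
    for _ in range(epochs):
        preds = 1.0 / (1.0 + np.exp(-data @ w))         # sigmoid predictions
        grad = data.T @ (preds - labels) / len(labels)  # mean gradient
        w -= lr * grad
    return w, len(labels)  # weights plus sample count for weighting

def weighted_average(updates):
    """Aggregate (weights, n_samples) pairs by sample-count weighting."""
    total = sum(n for _, n in updates)
    return sum(w * (n / total) for w, n in updates), total

rng = np.random.default_rng(0)
global_w = np.zeros(4)

# Two fog servers, each serving three edge devices with synthetic
# non-IID data (feature distribution shifted per fog region).
fog_results = []
for fog in range(2):
    ed_updates = []
    for ed in range(3):
        X = rng.normal(loc=fog, size=(50, 4))
        y = (X.sum(axis=1) > 4 * fog).astype(float)
        ed_updates.append(local_update(global_w, X, y))
    fog_results.append(weighted_average(ed_updates))  # fog-level aggregation

# Cloud-level aggregation sees only one model per fog server,
# which is what relieves congestion at the central aggregator.
global_w, _ = weighted_average(fog_results)
print("updated global weights:", global_w)
```

Because the cloud receives one update per FS instead of one per ED, the uplink traffic to the central aggregator scales with the number of fog servers rather than the (much larger) number of edge devices.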
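The Stackelberg incentive mechanism can likewise be sketched as a leader-follower game: an FS (leader) commits to a reward rate, and each ED (follower) best-responds with a contribution level. The abstract does not give PopFL's utility functions, so the quadratic cost, the per-ED cost parameters, and the grid search below are assumptions chosen only to show the backward-induction structure.

```python
def ed_best_response(reward_rate, unit_cost):
    """Follower: maximize r*x - c*x^2 over contribution x, so x* = r / (2c)."""
    return reward_rate / (2.0 * unit_cost)

def leader_utility(reward_rate, unit_costs, value_per_unit=1.0):
    """Leader (FS): value of total contribution minus payments to EDs."""
    xs = [ed_best_response(reward_rate, c) for c in unit_costs]
    total = sum(xs)
    return value_per_unit * total - reward_rate * total

# FS picks its reward rate anticipating the EDs' best responses
# (backward induction over a simple grid of candidate rates).
unit_costs = [0.5, 1.0, 2.0]  # hypothetical per-ED resource costs
rates = [i / 100 for i in range(1, 101)]
best_rate = max(rates, key=lambda r: leader_utility(r, unit_costs))

print("equilibrium reward rate:", best_rate)
print("ED contributions:",
      [round(ed_best_response(best_rate, c), 3) for c in unit_costs])
```

In this toy form, EDs with lower resource costs contribute more at equilibrium, which mirrors the abstract's goal of rewarding devices according to their contribution and resource availability.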