Abstract

In the rapidly expanding field of the Internet of Things (IoT), data is produced at an unprecedented scale and yields insights that accelerate decision-making. However, ensuring the privacy and security of this data poses significant challenges. In this paper, we propose using clustered federated learning (CFL) to protect the security and privacy of big data: clients upload only model weights, while the data remain stored locally. Nevertheless, applying CFL to big data raises practical challenges: (1) the participating FL clients are unlikely to have identical data distributions; (2) insufficient attention is given to the similarity between different clusters; and (3) CFL tends to ignore the class imbalance (i.e., long-tailed) problem, which hinders its application to big data and degrades the quality of target tasks. To address these issues and enable widespread CFL deployment in big data applications, this paper proposes a prototype-assisted clustered federated learning framework (MDSPFL). It relaxes the assumption that each client has a single data distribution, allowing a client's local dataset to follow multiple source distributions with class imbalance, which better matches clients in a big data environment. Specifically, MDSPFL employs a proximal update mechanism to handle the workload surges caused by mixed distributions and the lack of similarity information between cluster models. Additionally, MDSPFL introduces a class-balanced local training mechanism to resolve the long-tailed problem, which uses contrastive learning and class prototypes to enforce a uniform distribution of all classes in the feature space. We conduct extensive experiments on several datasets (EMNIST, CIFAR-10, CIFAR-100), and the results demonstrate the effectiveness of MDSPFL in big data scenarios with imbalanced, mixed-distribution clients.
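The abstract names two building blocks, class prototypes and a proximal update, without giving their exact form. As a rough illustration only, the sketch below assumes a prototype is the mean feature vector of a class and that the proximal update is a gradient step on the local loss plus a quadratic penalty toward an anchor model (as in FedProx-style training). The function names, the parameters `lr` and `mu`, and both definitions are assumptions for illustration, not the MDSPFL formulation.

```python
from collections import defaultdict


def class_prototypes(features, labels):
    """Return the mean feature vector per class (assumed prototype definition)."""
    sums = {}
    counts = defaultdict(int)
    for vec, y in zip(features, labels):
        if y not in sums:
            sums[y] = [0.0] * len(vec)
        # Accumulate the feature vectors of class y element-wise.
        sums[y] = [s + v for s, v in zip(sums[y], vec)]
        counts[y] += 1
    return {y: [s / counts[y] for s in sums[y]] for y in sums}


def proximal_step(weights, grads, anchor, lr=0.1, mu=0.01):
    """One gradient step on loss(w) + (mu / 2) * ||w - anchor||^2.

    `grads` is the gradient of the local loss at `weights`; `anchor` is the
    model the local weights are pulled toward (hypothetically, a cluster model).
    """
    return [w - lr * (g + mu * (w - a)) for w, g, a in zip(weights, grads, anchor)]


if __name__ == "__main__":
    feats = [[1.0, 0.0], [3.0, 2.0], [0.0, 4.0]]
    labs = [0, 0, 1]
    print(class_prototypes(feats, labs))
    print(proximal_step([1.0], [0.0], [0.0], lr=0.5, mu=1.0))
```

With a zero loss gradient, the proximal step simply shrinks the weights toward the anchor, which is the regularizing effect the penalty term is meant to provide.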
