Introduction to the Special Issue on Performance Evaluation of Federated Learning Systems Part 2
- Research Article
613
- 10.1109/tkde.2021.3124599
- Apr 1, 2023
- IEEE Transactions on Knowledge and Data Engineering
Federated learning has been a hot research topic for enabling the collaborative training of machine learning models among different organizations under privacy restrictions. As researchers try to support more machine learning models with different privacy-preserving approaches, there is a need to develop systems and infrastructures that ease the development of various federated learning algorithms. Just as deep learning systems such as PyTorch and TensorFlow have boosted the development of deep learning, federated learning systems (FLSs) are equally important, and face challenges from various aspects such as effectiveness, efficiency, and privacy. In this survey, we conduct a comprehensive review of federated learning systems. To achieve a smooth flow and guide future research, we introduce the definition of federated learning systems and analyze the system components. Moreover, we provide a thorough categorization of federated learning systems according to six different aspects: data distribution, machine learning model, privacy mechanism, communication architecture, scale of federation, and motivation of federation. The categorization can help guide the design of federated learning systems, as shown in our case studies. By systematically summarizing the existing federated learning systems, we present the design factors, case studies, and future research opportunities.
- Conference Article
10
- 10.1109/globecom48099.2022.10000743
- Dec 4, 2022
Federated Learning (FL) is considered the key approach for privacy-preserving, distributed machine learning (ML) systems. However, due to the transmission of large ML models from users to the server in each iteration of FL, communication over resource-constrained networks is currently a fundamental bottleneck in FL, restricting ML model complexity and user participation. One of the notable trends for reducing the communication cost of FL systems is gradient compression, in which techniques in the form of sparsification or quantization are utilized. However, these methods are fixed in advance and do not capture the redundant, correlated information across the parameters of the ML models, user devices' data, and iterations of FL. Further, these methods do not fully take advantage of the error-correcting capability of the FL process. In this paper, we propose the Federated Learning with Autoencoder Compression (FLAC) approach, which utilizes the redundant information and error-correcting capability of FL to compress user devices' models for uplink transmission. FLAC trains an autoencoder to encode and decode users' models at the server in the Training State, and then sends the autoencoder to user devices for compressing local models in future iterations during the Compression State. To guarantee the convergence of FL, FLAC dynamically controls the autoencoder error by switching between the Training State and the Compression State, adjusting its autoencoder and compression rate based on the error tolerance of the FL system. We theoretically prove that FLAC converges for FL systems with strongly convex ML models and non-i.i.d. data distributions. Our extensive experimental results over three datasets with different network architectures show that FLAC can achieve compression rates ranging from 83x to 875x while staying within 7 percent of the accuracy of non-compressed FL systems.
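As a minimal illustration of the idea (not the paper's actual method), the sketch below fits a linear autoencoder as a PCA basis over correlated user updates and uses a FLAC-style controller that switches between a Training State and a Compression State based on an assumed error tolerance. All names and values are illustrative.

```python
import numpy as np

def fit_linear_autoencoder(updates, k):
    # A linear autoencoder's optimum is the top-k PCA basis.
    # updates: (num_users, dim) matrix of flattened model updates.
    mean = updates.mean(axis=0)
    _, _, vt = np.linalg.svd(updates - mean, full_matrices=False)
    return mean, vt[:k]          # (k, dim) shared encoder/decoder weights

def encode(u, mean, basis):
    return basis @ (u - mean)    # dim floats -> k floats (compressed uplink)

def decode(z, mean, basis):
    return basis.T @ z + mean

rng = np.random.default_rng(0)
# Updates share low-rank structure across users: the redundancy
# an autoencoder-based compressor can exploit.
latent = rng.normal(size=(8, 4))
updates = latent @ rng.normal(size=(4, 100))

mean, basis = fit_linear_autoencoder(updates, k=4)
tolerance = 1e-6
state = "Training"
for u in updates:
    err = np.linalg.norm(u - decode(encode(u, mean, basis), mean, basis))
    # FLAC-style control: compress only while the reconstruction
    # error stays within the FL system's error tolerance.
    state = "Compression" if err < tolerance else "Training"
```

Here each 100-float update is sent as 4 floats (a 25x ratio); FLAC's actual autoencoder, state machine, and convergence analysis are considerably more elaborate.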
- Conference Article
10
- 10.1109/ccnc49033.2022.9700513
- Jan 8, 2022
Federated learning is a collaborative/distributed machine learning approach designed to address the privacy issues of centralized machine learning systems. The transparency and provenance of a machine learning model are important aspects of federated learning systems, since such models impact people's lives in various domains (e.g., from healthcare to personal finance to employment). However, most existing federated learning systems rely on centralized coordinators, which are vulnerable to attacks and privacy breaches. They also do not provide any standard transparency and provenance mechanisms for the resulting models. In this paper, we propose "Bassa-ML," an integrated federated learning system based on blockchain and Model Cards that provides enhanced transparency and trust for the models. Model parameter sharing, local model generation, model averaging, and model sharing functions are implemented using smart contracts. The generated models, model training information, and model reports are stored in the blockchain ledger as Model Card Objects. This brings enhanced transparency and auditability to the federated learning process.
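Setting aside the blockchain machinery, the model-averaging function such a system encodes in a smart contract reduces to weighted federated averaging. A minimal numpy sketch of that aggregation rule (not the contract itself; the weights-by-dataset-size convention is an assumption):

```python
import numpy as np

def federated_average(models, weights):
    # FedAvg-style aggregation: a weighted average of client model
    # parameters, here weighted by each client's dataset size.
    total = sum(weights)
    return sum(w * m for w, m in zip(weights, models)) / total

# Two clients' (flattened) model parameters and their dataset sizes.
client_models = [np.array([1.0, 2.0]), np.array([3.0, 4.0])]
dataset_sizes = [100, 300]
global_model = federated_average(client_models, dataset_sizes)
```

The second client holds three times the data, so the global model lands three quarters of the way toward its parameters.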
- Conference Article
6
- 10.1109/iccspa55860.2022.10019204
- Dec 27, 2022
Federated Learning (FL) is considered the key enabling approach for privacy-preserving, distributed machine learning (ML) systems. FL requires the periodic transmission of ML models from users to the server; therefore, communication over resource-constrained networks is currently a fundamental bottleneck in FL, restricting ML model complexity and user participation. One of the notable trends for reducing the communication cost of FL systems is gradient compression, in which techniques in the form of sparsification are utilized. However, these methods use a single compression rate for all users and do not consider communication heterogeneity in a real-world FL system; as a result, they are bottlenecked by the worst communication capacity across users. Further, sparsification methods are non-adaptive and do not exploit the redundant, similar information across users' ML models for compression. In this paper, we introduce a novel Dynamic Sparsification for Federated Learning (DSFL) approach that enables users to compress their local models based on their communication capacity at each iteration, using two novel sparsification methods: layer-wise similarity sparsification (LSS) and extended top-$K$ sparsification. LSS enables DSFL to exploit the globally redundant information in users' models by using Centralized Kernel Alignment (CKA) similarity for sparsification. The extended top-$K$ model sparsification method allows DSFL to accommodate the heterogeneous communication capacities of user devices by allowing a different sparsification rate $K$ for each user at each iteration.
Our extensive experimental results on three datasets show that DSFL has a faster convergence rate than fixed sparsification, and this gap widens as communication heterogeneity increases. Further, our thorough experimental investigations uncover the similarities of user models across the FL system. All code and experiments are publicly available at: https://github.com/mahdibeit/DSFL.
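At its core, the extended top-$K$ idea lets each user keep a different number of largest-magnitude entries. A minimal numpy sketch with assumed per-user capacities (illustrative values only; DSFL additionally uses CKA-based layer-wise similarity):

```python
import numpy as np

def top_k_sparsify(update, k):
    """Keep only the k largest-magnitude entries; zero out the rest."""
    mask = np.zeros_like(update)
    idx = np.argsort(np.abs(update))[-k:]   # indices of k largest magnitudes
    mask[idx] = 1.0
    return update * mask

rng = np.random.default_rng(1)
# Per-user K chosen from each user's communication capacity
# (a faster link allows a larger K), instead of one fixed K for all.
capacities = {"user_a": 10, "user_b": 3}
updates = {u: rng.normal(size=20) for u in capacities}
sparse = {u: top_k_sparsify(updates[u], k) for u, k in capacities.items()}
```

Each user then uploads only its nonzero entries (values plus indices), so the slow user is no longer the bottleneck for everyone.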
- Conference Article
5
- 10.1109/mass56207.2022.00015
- Oct 1, 2022
In cross-silo federated learning (FL), organizations cooperatively train a global model with their local data. The organizations, however, own different datasets and may be heterogeneous in their expectations of the precision of the global model. Meanwhile, the cost of secure global model aggregation, including computation and communication, is proportional to the square of the number of organizations in the FL system. In this paper, we consider all organizations in the FL system as a grand coalition. We introduce a novel concept from coalition game theory that allows the dynamic formation of coalitions among organizations, and construct a simple, distributed merge-and-split algorithm for coalition formation. The aim is to find an ultimate coalition structure that allows cooperating organizations to maximize their utilities while accounting for the coalition formation cost. Through this novel game-theoretical framework, the FL system is able to self-organize and form a structured network composed of disjoint stable coalitions. To fairly distribute cost in each formed coalition, a cost-sharing mechanism is proposed to align members' individual utility with their coalition's utility. In FL systems, training data has a significant impact on model performance: grouping organizations with greater data complementarity should lead to a more precise global model. Numerical evaluations are presented to verify the proposed models.
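A toy sketch of the merge rule (the split rule is omitted for brevity), with a hypothetical utility that rewards pairwise data complementarity and charges a secure-aggregation cost quadratic in coalition size; the complementarity values and cost are made-up illustrative numbers, not the paper's model:

```python
from itertools import combinations

def utility(coalition, comp, cost):
    # Hypothetical utility: pairwise data complementarity minus a
    # secure-aggregation cost quadratic in the coalition size.
    pairs = combinations(sorted(coalition), 2)
    return sum(comp[(i, j)] for i, j in pairs) - cost * len(coalition) ** 2

def greedy_merge(members, comp, cost):
    # Start from singletons; merge any two coalitions whose union has
    # higher utility than the two parts combined (the "merge" rule).
    coalitions = [frozenset([m]) for m in members]
    merged = True
    while merged:
        merged = False
        for a, b in combinations(coalitions, 2):
            if (utility(a | b, comp, cost)
                    > utility(a, comp, cost) + utility(b, comp, cost)):
                coalitions = [c for c in coalitions if c not in (a, b)]
                coalitions.append(a | b)
                merged = True
                break
    return coalitions

# Organizations 0 and 1 have highly complementary data; 2 does not.
comp = {(0, 1): 5.0, (0, 2): 0.5, (1, 2): 0.5}
result = greedy_merge([0, 1, 2], comp, cost=1.0)
```

With these numbers, 0 and 1 merge while 2 stays alone: the grand coalition's quadratic aggregation cost outweighs the small extra complementarity it would add.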
- Research Article
57
- 10.1016/j.jss.2022.111357
- May 7, 2022
- Journal of Systems and Software
Architectural patterns for the design of federated learning systems
- Research Article
1
- 10.1016/j.cose.2024.103936
- Jun 4, 2024
- Computers & Security
FedIMP: Parameter Importance-based Model Poisoning attack against Federated learning system
- Research Article
1
- 10.1016/j.comnet.2024.110691
- Aug 5, 2024
- Computer Networks
TPE-BFL: Training Parameter Encryption scheme for Blockchain based Federated Learning system
- Book Chapter
16
- 10.1007/978-3-030-86044-8_6
- Jan 1, 2021
Federated learning is an emerging machine learning paradigm that enables multiple devices to train models locally and formulate a global model without sharing the clients' local data. A federated learning system can be viewed as a large-scale distributed system involving different components and stakeholders with diverse requirements and constraints. Hence, developing a federated learning system requires both software system design thinking and machine learning knowledge. Although much effort has been put into federated learning from the machine learning perspective, our previous systematic literature review on the area shows a distinct lack of consideration for software architecture design in federated learning. In this paper, we propose FLRA, a reference architecture for federated learning systems, which provides a template design for federated learning-based solutions. The proposed FLRA reference architecture is based on an extensive review of existing patterns of federated learning systems found in the literature and in existing industrial implementations. It consists of a pool of architectural patterns that address the frequently recurring design problems in federated learning architectures, and can serve as a design guideline to assist architects and developers with practical solutions for their problems, which can be further customised.
- Book Chapter
- 10.3233/faia250917
- Oct 21, 2025
Neural networks unintentionally memorize training data, creating privacy risks in federated learning (FL) systems, such as inference and reconstruction attacks on sensitive data. To mitigate these risks and to comply with privacy regulations, Federated Unlearning (FU) has been introduced to enable participants in FL systems to remove their data’s influence from the global model. However, current FU methods primarily act post-hoc, struggling to efficiently erase information deeply memorized by neural networks. We argue that effective unlearning necessitates a paradigm shift: designing FL systems inherently amenable to forgetting. To this end, we propose a learning-to-unlearn Transformation-guided Federated Unlearning (ToFU) framework that incorporates transformations during the learning process to reduce memorization of specific instances. Our theoretical analysis reveals how transformation composition provably bounds instance-specific information, directly simplifying subsequent unlearning. Crucially, ToFU can work as a plug-and-play framework that improves the performance of existing FU methods. Experiments on CIFAR-10, CIFAR-100, and the MUFAC benchmark show that ToFU outperforms existing FU baselines, enhances performance when integrated with current methods, and reduces unlearning time.
- Research Article
7
- 10.1109/ojcoms.2023.3266389
- Jan 1, 2023
- IEEE Open Journal of the Communications Society
This paper studies a bandwidth-limited federated learning (FL) system in which an access point serves as the central server for aggregation, while energy-constrained user equipments (UEs) with limited computation capabilities (e.g., Internet of Things devices) perform local training. Limited by the bandwidth of wireless edge systems, only a subset of UEs can participate in each FL training round. Selecting different UEs affects FL performance, and the selected UEs need to allocate their computing resources effectively. In wireless edge FL systems, simultaneously accelerating FL training and reducing computing-communication energy consumption are both important. To this end, we formulate a multi-objective optimization problem (MOP). In the MOP, model training convergence is difficult to calculate accurately; meanwhile, the MOP is a combinatorial optimization problem with high-dimensional mixed-integer variables, which is proved to be NP-hard. To address these challenges, a multi-objective evolutionary algorithm for the bandwidth-limited FL system (MOEA-FL) is proposed to obtain a Pareto optimal solution set. In MOEA-FL, an age-of-update-loss method is first proposed to transform the original global loss function into a convergence reference function. Then, MOEA-FL divides the MOP into N single-objective subproblems using the Tchebycheff approach and optimizes the subproblems simultaneously by evolving a population. Extensive experiments have been carried out on the MNIST dataset and a medical case, the TissueMNIST dataset, for both i.i.d. and non-i.i.d. data settings. Experimental results demonstrate that MOEA-FL outperforms other algorithms and verify its robustness and scalability.
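The Tchebycheff approach scalarizes the objectives (e.g., training delay and energy) into the worst weighted distance from an ideal point; each weight vector defines one subproblem. A minimal sketch with made-up objective values, not the paper's actual formulation:

```python
def tchebycheff(objectives, weights, ideal):
    # Tchebycheff scalarization: the worst weighted gap to the ideal
    # point. Minimizing it for varied weight vectors traces Pareto points.
    return max(w * abs(f - z) for f, w, z in zip(objectives, weights, ideal))

# Two objectives to minimize: (training delay, energy), hypothetical values.
candidates = [(10.0, 2.0), (6.0, 5.0), (3.0, 9.0)]
ideal = (3.0, 2.0)       # best value observed for each objective
weights = (0.5, 0.5)     # one subproblem's weight vector out of N
best = min(candidates, key=lambda f: tchebycheff(f, weights, ideal))
```

With equal weights, the balanced candidate (6.0, 5.0) wins; skewing the weights toward delay or energy would select a different Pareto point.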
- Research Article
1
- 10.3390/electronics12040842
- Feb 7, 2023
- Electronics
In this work, we formalize the concept of differential model robustness (DMR), a new property for ensuring model security in federated learning (FL) systems. In most conventional FL frameworks, all clients receive the same global model; if a Byzantine client maliciously generates adversarial samples against the global model, the attack immediately transfers to all other benign clients. To address this attack-transferability concern and improve the DMR of FL systems, we propose the notion of differential model distribution (DMD), in which the server distributes different models to different clients. As a concrete instantiation of DMD, we propose the ARMOR framework, which utilizes differential adversarial training to prevent a corrupted client from launching white-box adversarial attacks against other clients, since the local model received by the corrupted client differs from those of benign clients. Through extensive experiments, we demonstrate that ARMOR can significantly reduce both the attack success rate (ASR) and the average adversarial transfer rate (AATR) across different FL settings. For instance, for a 35-client FL system, the ASR and AATR can be reduced by as much as 85% and 80% on the MNIST dataset.
- Conference Article
4
- 10.1109/services51467.2021.00034
- Sep 1, 2021
Federated Learning (FL) deployed in edge computing can achieve advantages such as private data protection, reduced communication cost, and lower training latency compared with cloud-centric training approaches. The Anything-as-a-Service (XaaS) paradigm, as the main service provisioning model in edge computing, enables various flexible FL deployments. On the other hand, the distributed nature of FL, together with the highly diverse computing and networking infrastructures in an edge environment, introduces extra latency that may degrade FL performance. Therefore, delay performance evaluation of edge-based FL systems becomes an important research topic. However, XaaS-based FL deployment brings new challenges to performance analysis that cannot be well addressed by conventional analytical approaches. In this paper, we address these challenges by proposing a profile-based modeling and analysis method for evaluating the delay performance of edge-based FL systems. The insights obtained from the modeling and analysis may offer useful guidelines for various aspects of FL design. The application of network calculus techniques makes the proposed method general and flexible, so it may be applied to FL systems deployed on heterogeneous edge infrastructures.
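For a sense of what network calculus delivers: a token-bucket arrival curve a(t) = b + rt served by a rate-latency server B(t) = R * max(t - T, 0) has worst-case delay T + b/R whenever r <= R. This is the standard textbook bound, not the paper's profile-based method; the numbers below are illustrative.

```python
def delay_bound(burst, arrival_rate, service_rate, latency):
    # Worst-case delay for a (burst, arrival_rate) token-bucket flow
    # through a rate-latency server: valid only when the arrival rate
    # does not exceed the service rate (otherwise the queue diverges).
    assert arrival_rate <= service_rate, "flow is unstable"
    return latency + burst / service_rate

# E.g., a 2 Mb model-update burst at 5 Mb/s sustained rate, through a
# 10 Mb/s edge link with 0.05 s scheduling latency.
d = delay_bound(burst=2.0, arrival_rate=5.0, service_rate=10.0, latency=0.05)
```

Composing such per-hop bounds along an XaaS service chain is what makes the approach attractive for heterogeneous edge deployments.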
- Research Article
200
- 10.1016/j.knosys.2021.107338
- Jul 28, 2021
- Knowledge-Based Systems
A federated learning system with enhanced feature extraction for human activity recognition
- Conference Article
15
- 10.1109/cvprw56347.2022.00021
- Jun 1, 2022
Adversarial Training (AT) is crucial for obtaining deep neural networks that are robust to adversarial attacks, yet recent works found that it could also make models more vulnerable to privacy attacks. In this work, we further reveal this unsettling property of AT by designing a novel privacy attack that is practically applicable to the privacy-sensitive Federated Learning (FL) systems. Using our method, the attacker can exploit AT models in the FL system to accurately reconstruct users’ private training images even when the training batch size is large. Code is available at https://github.com/zjysteven/PrivayAttack_AT_FL.