Exploring Visual Explanations for Defending Federated Learning against Poisoning Attacks: Enhancing LayerCAM with Autoencoders
Recent attacks on federated learning (FL) can introduce malicious model updates that can circumvent widely adopted Euclidean distance-based detection methods. This paper proposes a novel defense strategy, referred to as LayerCAM-AE, designed to counteract model poisoning in federated learning. The LayerCAM-AE puts forth a new Layer Class Activation Mapping (LayerCAM) integrated with an autoencoder (AE), significantly enhancing detection capabilities. Specifically, LayerCAM-AE generates a heat map for each local model update, which is then transformed into a more compact visual explanation. The autoencoder processes the LayerCAM heat maps from the local model updates, improving their distinctiveness and increasing the accuracy in spotting anomalous maps and malicious local models. To mitigate the risk of misclassifications in LayerCAM-AE, a voting algorithm is developed, where a local model update is flagged as malicious if its heat maps are consistently suspicious over several communication rounds. Extensive tests on the SVHN and CIFAR-100 datasets are performed under both Independent and Identically Distributed (IID) and non-IID settings in comparison with the state-of-the-art ResNet-50 and REGNETY-800MF defense models. The experimental results show that LayerCAM-AE increases detection rates (Recall: 1.0, Precision: 1.0, FPR: 0.0, Accuracy: 1.0, F1 score: 1.0, AUC: 1.0) and the test accuracy of FL, surpassing both the ResNet-50 and REGNETY-800MF. Our code is available at: https://github.com/jjzgeeks/LayerCAM-AE
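The voting step described in this abstract can be sketched as follows. This is only a minimal illustration, assuming that a per-round anomaly detector has already produced a suspicious/not-suspicious flag for each client; the function name `vote_malicious`, the sliding window of 5 rounds, and the threshold of 4 votes are hypothetical and are not taken from the paper.

```python
from collections import defaultdict, deque


def vote_malicious(round_flags, window=5, min_votes=4):
    """Return the clients flagged in at least `min_votes` of their last `window` rounds.

    round_flags: list with one dict per communication round,
                 mapping client id -> True if that round's heat map looked anomalous.
    window, min_votes: hypothetical parameters chosen for illustration only.
    """
    # Keep only the most recent `window` flags per client.
    history = defaultdict(lambda: deque(maxlen=window))
    for flags in round_flags:
        for client, suspicious in flags.items():
            history[client].append(bool(suspicious))
    # A client is declared malicious only if it was suspicious consistently.
    return {client for client, votes in history.items() if sum(votes) >= min_votes}


rounds = [
    {"a": False, "b": True},
    {"a": False, "b": True},
    {"a": True, "b": True},   # a single false alarm for client "a"
    {"a": False, "b": True},
    {"a": False, "b": True},
]
flagged = vote_malicious(rounds)
```

In this toy run, client "a" is flagged in only one round and is therefore not voted malicious, which is the kind of one-off misclassification the voting step is meant to absorb.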
- Research Article
- 10.1145/3765743
- Oct 13, 2025
- ACM Transactions on Privacy and Security
Recent poisoning attacks on federated learning (FL) generate malicious model updates that circumvent widely adopted Euclidean distance-based detection methods. This article proposes a new defense mechanism, namely, GradCAM-AE, against model poisoning attacks on FL, which integrates Gradient-weighted Class Activation Mapping (GradCAM) and an autoencoder (AE) to offer a substantially more powerful detection capability than existing Euclidean distance-based approaches. In particular, GradCAM-AE generates a heat map for each uploaded local model update, transforming each local model update into a lower-dimensional, visual representation. An AE further reprojects the GradCAM heat maps of all local model updates with improved distinguishability, thereby accentuating the hidden features of the heat maps and increasing the success rate of identifying anomalous heat maps and malicious local models. A comprehensive evaluation of the proposed GradCAM-AE framework is conducted using the CIFAR-10 and GTSRB datasets under both Independent and Identically Distributed (IID) and Non-IID settings. The ResNet-18 and MobileNetV3-Large models are tested. The results substantiate that GradCAM-AE offers superior detection rates and test accuracy of the FL global model compared with contemporary state-of-the-art methods. Our code is available at: https://github.com/jjzgeeks/GradCAM-AE .
- Conference Article
5
- 10.1109/gcwkshps56602.2022.10008615
- Dec 4, 2022
In this paper, a communication-efficient federated learning (FL) framework is proposed, which leverages ideas from vector quantized compressed sensing, for the first time, to compress the local model updates at wireless devices in FL. For the compression, each local model update is projected onto a lower dimensional space; then, the projected local model update is quantized by using a vector quantizer. The global model update at a parameter server is reconstructed by using a sparse signal recovery algorithm on the aggregation of the compressed local model updates. A key feature of our compression strategy is that the local model update after the projection is effectively modeled as a Gaussian random vector by the central limit theorem. Inspired by this feature, the optimal vector quantizer is derived for minimizing the compression error of the local model update. Simulation results on the MNIST dataset demonstrate that the proposed framework that uses 0.5 bit to represent each local model update entry shows less than a 1% decrease in classification accuracy compared to FL without local update compression.
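The device-side compression described in this abstract (projection onto a lower-dimensional space, then vector quantization) can be sketched as below. This is a toy illustration under stated assumptions, not the paper's design: the Gaussian projection matrix, the fixed four-word codebook, and all function names are hypothetical, the optimal quantizer derived in the paper is not reproduced, and the server-side sparse signal recovery step is omitted.

```python
import random


def random_projection_matrix(m, n, seed=0):
    """Build an m x n Gaussian matrix (assumed projection; scaled by 1/sqrt(m))."""
    rng = random.Random(seed)
    scale = 1.0 / m ** 0.5
    return [[rng.gauss(0.0, 1.0) * scale for _ in range(n)] for _ in range(m)]


def project(matrix, update):
    """Project a length-n local model update onto m dimensions (m < n)."""
    return [sum(a * x for a, x in zip(row, update)) for row in matrix]


def quantize(vector, codebook, dim):
    """Map each length-`dim` block of `vector` to the index of its nearest codeword.

    Assumes len(vector) is a multiple of `dim`.
    """
    indices = []
    for start in range(0, len(vector), dim):
        block = vector[start:start + dim]
        nearest = min(
            range(len(codebook)),
            key=lambda k: sum((b - c) ** 2 for b, c in zip(block, codebook[k])),
        )
        indices.append(nearest)
    return indices


# Toy codebook: 4 codewords for 2-entry blocks, i.e. 2 bits per block.
codebook = [[-1.0, -1.0], [1.0, 1.0], [-1.0, 1.0], [1.0, -1.0]]
projected = project(random_projection_matrix(2, 4), [1.0, 2.0, 3.0, 4.0])
indices = quantize([0.9, 1.2, -0.8, 0.7], codebook, dim=2)
```

Only the codeword indices would be transmitted; with this toy codebook each index costs 2 bits for a 2-entry block of the projected update.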
- Research Article
- 10.52783/jisem.v10i51s.10438
- May 30, 2025
- Journal of Information Systems Engineering and Management
Centralized machine learning requires data to be gathered on a single server for model training; individuals' data must be transmitted to that server in raw form, which raises serious privacy and security concerns. Federated learning is a decentralized machine learning technique that mitigates the security and privacy issues of traditional machine learning by enabling local model training on devices without sharing raw data with a centralized server. Federated learning involves multiple clients and one central server. Each client trains on its own data, while the server coordinates the overall federated learning process. In federated learning, raw data never leaves the client, ensuring data confidentiality. Only local model updates from each client are transmitted to the central server that organizes the learning process. The server aggregates the received local model updates and then updates the global model. The final global model is then used for evaluation. Although federated learning improves on the privacy and security of centralized machine learning, it is still targeted by attacks through the model updates transmitted between clients and server. To improve the privacy and security of federated learning, privacy preservation techniques are integrated with it. We present a survey of privacy preservation techniques combined with federated learning to improve privacy and security and achieve a good balance between utility and privacy. Private Aggregation of Teacher Ensembles, Homomorphic Encryption, and Secure Multi-Party Computation are the most widely used privacy preservation techniques with federated learning for malicious behavior detection.
- Research Article
10
- 10.3390/info16030244
- Mar 18, 2025
- Information
Federated learning (FL) is a machine learning technique where clients exchange only local model updates with a central server that combines them to create a global model after local training. While FL offers privacy benefits through local training, privacy-preserving strategies are needed since model updates can leak training data information due to various attacks. To enhance privacy and attack robustness, techniques like homomorphic encryption (HE), Secure Multi-Party Computation (SMPC), and the Private Aggregation of Teacher Ensembles (PATE) can be combined with FL. Currently, no study has combined more than two privacy-preserving techniques with FL or comparatively analyzed their combinations. We conducted a comparative study of privacy-preserving techniques in FL, analyzing performance and security. We implemented FL using an artificial neural network (ANN) with a Malware Dataset from Kaggle for malware detection. To enhance privacy, we proposed models combining FL with the PATE, SMPC, and HE. All models were evaluated against poisoning attacks (targeted and untargeted), a backdoor attack, a model inversion attack, and a man in the middle attack. The combined models maintained performance while improving attack robustness. FL_SMPC, FL_CKKS, and FL_CKKS_SMPC improved both their performance and attack resistance. All the combined models outperformed the base FL model against the evaluated attacks. FL_PATE_CKKS_SMPC achieved the lowest backdoor attack success rate (0.0920). FL_CKKS_SMPC best resisted untargeted poisoning attacks (0.0010 success rate). FL_CKKS and FL_CKKS_SMPC best defended against targeted poisoning attacks (0.0020 success rate). FL_PATE_SMPC best resisted model inversion attacks (19.267 MSE). FL_PATE_CKKS_SMPC best defended against man in the middle attacks with the lowest degradation in accuracy (1.68%), precision (1.94%), recall (1.68%), and the F1-score (1.64%).
- Research Article
31
- 10.1109/access.2021.3128622
- Jan 1, 2021
- IEEE Access
Federated Learning (FL) relies on on-device training to avoid the migration of devices’ data to a centralized server to address privacy leakage. Moreover, FL is feasible for scenarios (e.g., autonomous cars) where an enormous amount of data is generated every day. Transferring only local model updates in the case of FL is highly communication-efficient compared to transferring all data in the case of centralized machine learning (ML). Although FL offers many advantages, it also has some challenges. A malicious aggregation server can infer device information via local model updates. Another downside of FL is the centralized aggregation server that can malfunction due to an attack or physical damage. To address these issues, we propose a novel Structured Transparency empowered cross-silo Federated Learning on the Blockchain (ST-BFL) framework. In ST-BFL, homomorphic encryption, FL-aggregators, FL-verifiers, and smart contract are employed, which satisfy various structured transparency components, such as input privacy, output privacy, output verification, and flow governance. We present the framework architecture, algorithms, and sequence diagram of our ST-BFL framework to show how different entities interact in ST-BFL for the FL process. We also present a simplified class diagram of ST-BFL’s smart contract for an FL task. Finally, we perform a simulation to analyze our framework from the perspective of aggregation time, accuracy, and storage size. The qualitative and quantitative evaluation shows that ST-BFL has the same accuracy as traditional FL. However, ST-BFL provides input privacy, output privacy, input verification, output verification, and flow governance at the expense of relatively higher computation and communication costs than traditional FL.
- Research Article
2
- 10.3390/e26080712
- Aug 21, 2024
- Entropy (Basel, Switzerland)
With the rapid advancement of the Internet and big data technologies, traditional centralized machine learning methods are challenged when dealing with large-scale datasets. Federated Learning (FL), as an emerging distributed machine learning paradigm, enables multiple clients to collaboratively train a global model while preserving privacy. Edge computing, also recognized as a critical technology for handling massive datasets, has garnered significant attention. However, the heterogeneity of clients in edge computing environments can severely impact the performance of the resultant models. This study introduces an Adaptive Personalized Client-Selection and Model-Aggregation Algorithm, APCSMA, aimed at optimizing FL performance in edge computing settings. The algorithm evaluates clients' contributions by calculating the real-time performance of local models and the cosine similarity between local and global models, and it designs a ContriFunc function to quantify each client's contribution. The server then selects clients and assigns weights during model aggregation based on these contributions. Moreover, the algorithm accommodates personalized needs in local model updates, rather than simply overwriting with the global model. Extensive experiments were conducted on the FashionMNIST and CIFAR-10 datasets, simulating three data distributions with parameters dir = 0.1, 0.3, and 0.5. The accuracy improvements achieved were 3.9%, 1.9%, and 1.1% for the FashionMNIST dataset, and 31.9%, 8.4%, and 5.4% for the CIFAR-10 dataset, respectively.
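The contribution-weighted aggregation described in this abstract can be illustrated with a small sketch. The abstract does not give the form of ContriFunc, so the score used here (local accuracy multiplied by the non-negative part of the cosine similarity to the global model) is a hypothetical stand-in, as are the function names; models are represented as flat lists of floats.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two flat parameter vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def contribution_weights(local_models, accuracies, global_model):
    """Hypothetical contribution score: accuracy * max(cosine similarity, 0), normalized to sum to 1."""
    scores = [
        acc * max(cosine_similarity(model, global_model), 0.0)
        for model, acc in zip(local_models, accuracies)
    ]
    total = sum(scores)
    if total == 0.0:
        # Fall back to uniform weights when no client scores above zero.
        return [1.0 / len(scores)] * len(scores)
    return [s / total for s in scores]


def aggregate(local_models, weights):
    """Weighted average of the local models, coordinate by coordinate."""
    dim = len(local_models[0])
    return [sum(w * m[i] for w, m in zip(weights, local_models)) for i in range(dim)]


global_model = [1.0, 0.0]
local_models = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
weights = contribution_weights(local_models, [0.5, 0.9, 0.9], global_model)
new_global = aggregate(local_models, weights)
```

In this toy case the second and third clients are orthogonal or opposed to the global model, so they receive zero weight despite their higher accuracy.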
- Research Article
24
- 10.1109/twc.2023.3291877
- Mar 1, 2024
- IEEE Transactions on Wireless Communications
In this paper, a new communication-efficient federated learning (FL) framework is proposed, inspired by vector quantized compressed sensing. The basic strategy of the proposed framework is to compress the local model update at each device by applying dimensionality reduction followed by vector quantization. Subsequently, the global model update is reconstructed at a parameter server by applying a sparse signal recovery algorithm to the aggregation of the compressed local model updates. By harnessing the benefits of both dimensionality reduction and vector quantization, the proposed framework effectively reduces the communication overhead of local update transmissions. Both the design of the vector quantizer and the key parameters for the compression are optimized so as to minimize the reconstruction error of the global model update under the constraint of wireless link capacity. By considering the reconstruction error, the convergence rate of the proposed framework is also analyzed for a non-convex loss function. Simulation results on the MNIST and FEMNIST datasets demonstrate that the proposed framework can improve classification accuracy by more than 2.4% compared to state-of-the-art FL frameworks when the communication overhead of the local model update transmission is 0.1 bit per local model entry.
- Research Article
234
- 10.1109/twc.2021.3052681
- Jan 28, 2021
- IEEE Transactions on Wireless Communications
We study federated learning (FL) at the wireless edge, where power-limited devices with local datasets collaboratively train a joint model with the help of a remote parameter server (PS). We assume that the devices are connected to the PS through a bandwidth-limited shared wireless channel. At each iteration of FL, a subset of the devices is scheduled to transmit their local model updates to the PS over orthogonal channel resources, while each participating device must compress its model update to fit its link capacity. We design novel scheduling and resource allocation policies that decide on the subset of the devices to transmit at each round, and how the resources should be allocated among the participating devices, not only based on their channel conditions, but also on the significance of their local model updates. We then establish convergence of a wireless FL algorithm with device scheduling, where devices have limited capacity to convey their messages. The results of numerical experiments show that the proposed scheduling policy, based on both the channel conditions and the significance of the local model updates, provides better long-term performance than scheduling policies based only on either of the two metrics individually. Furthermore, we observe that when the data is independent and identically distributed (i.i.d.) across devices, selecting a single device at each round provides the best performance, while when the data distribution is non-i.i.d., scheduling multiple devices at each round improves the performance. This observation is verified by the convergence result, which shows that the number of scheduled devices should increase for a less diverse and more biased data distribution.
- Research Article
69
- 10.1609/aaai.v36i8.20903
- Jun 28, 2022
- Proceedings of the AAAI Conference on Artificial Intelligence
Federated learning (FL) is a privacy-preserving distributed machine learning paradigm that enables multiple clients to collaboratively train statistical models without disclosing raw training data. However, the inaccessible local training data and uninspectable local training process make FL susceptible to various Byzantine attacks (e.g., data poisoning and model poisoning attacks), aiming to manipulate the FL model training process and degrade the model performance. Most of the existing Byzantine-robust FL schemes cannot effectively defend against stealthy poisoning attacks that craft poisoned models statistically similar to benign models. Things worsen when many clients are compromised or data among clients are highly non-independent and identically distributed (non-IID). In this work, to address these issues, we propose FedInv, a novel Byzantine-robust FL framework by inversing local model updates. Specifically, in each round of local model aggregation in FedInv, the parameter server first inverses the local model updates submitted by each client to generate a corresponding dummy dataset. Then, the server identifies those dummy datasets with exceptional Wasserstein distances from others and excludes the related local model updates from model aggregation. We conduct an exhaustive experimental evaluation of FedInv. The results demonstrate that FedInv significantly outperforms the existing robust FL schemes in defending against stealthy poisoning attacks under highly non-IID data partitions.
- Research Article
18
- 10.1109/jiot.2022.3184812
- Nov 1, 2022
- IEEE Internet of Things Journal
Federated learning (FL) has emerged to leverage datasets from multiple devices to improve the performance of a machine learning (ML) model while providing privacy preservation for devices. The training data is collected at the devices, also known as FL workers, which collaboratively train a global learning model and share their local model updates with a central entity or server without sharing their data. However, FL can be susceptible to various adversarial attacks that target its security and privacy. In particular, the workers can upload unreliable local model updates, leading to corruption of the main FL task. Workers may intentionally contribute unreliable local updates by launching poisoning attacks or unintentionally by updating low-quality models caused by high device mobility, limited device resources, or unstable network connection. Consequently, identifying reliable and trustworthy workers becomes critical for FL security. In this article, the concept of reputation is adopted as a metric to evaluate workers’ reliability and trustworthiness. In addition, a deep reinforcement learning (DRL)-based reputation mechanism is proposed for optimal selection and evaluation of reliable FL workers. Due to the dynamic nature of worker behavior in the FL environment, the DRL-based algorithm deep deterministic policy gradient (DDPG) is employed to improve the FL model accuracy and stability. We compare the performance of our proposed method with a conventional reputation method and a deep $Q$-networks (DQNs)-based reputation method. Our simulation results demonstrate that our proposed method can improve FL accuracy by more than 30% under various scenarios and achieves better convergence than the other methods.
- Research Article
7
- 10.1109/jiot.2024.3399259
- Aug 15, 2024
- IEEE Internet of Things Journal
Federated learning (FL) in Internet of Things (IoT) applications facilitates the collaborative training of a global model across distributed devices with a server. Despite its potential, the distributed nature and vulnerability of IoT devices render FL susceptible to Byzantine attacks. Existing approaches to counter these attacks are often impractical in real-world IoT scenarios, mainly due to the challenges posed by non-independent and identically distributed (non-IID) data and the high-dimensional models common in IoT devices. To address these challenges, we propose Guard-FL, an efficient and robust aggregation mechanism assisted by uniform manifold approximation and projection (UMAP) for FL. Guard-FL is designed to enhance the performance of the global model in non-IID data environments without compromising defense capabilities. Specifically, it utilizes UMAP to capture non-linear features among high-dimensional local models. Based on these features, robust regression and unsupervised clustering techniques are applied to effectively detect and remove attackers from local model updates. Subsequently, the server employs information stored in weights to evaluate and aggregate the remaining divergent model updates, thus significantly improving the global model's performance. To validate the efficacy of Guard-FL, we provide a theoretical analysis of its convergence properties. Our experiments demonstrate that Guard-FL surpasses existing state-of-the-art solutions, achieving up to 96% accuracy in detecting malicious clients on non-IID CIFAR-10 datasets under various Byzantine attack scenarios. The implementation code is provided at https://github.com/XidianNSS/Guard-FL.git
- Research Article
18
- 10.1109/tsp.2021.3125137
- Jan 1, 2021
- IEEE Transactions on Signal Processing
Federated averaging (FedAvg) is a popular federated learning (FL) technique that updates the global model by averaging local models and then transmits the updated global model to devices for their local model update. One main limitation of FedAvg is that the average-based global model is not necessarily better than local models in the early stage of the training process so that FedAvg might diverge in realistic scenarios, especially when the data is non-identically distributed across devices and the number of data samples varies significantly from device to device. In this paper, we propose a new FL technique based on simulated annealing. The key idea of the proposed technique, henceforth referred to as simulated annealing-based FL (SAFL), is to allow a device to choose its local model when the global model is immature. Specifically, by exploiting the simulated annealing strategy, we make each device choose its local model with high probability in early iterations when the global model is immature. From extensive numerical experiments using various benchmark datasets, we demonstrate that SAFL outperforms the conventional FedAvg technique in terms of the convergence speed and the classification accuracy.
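The key idea in this abstract, that a device keeps its local model with high probability in early iterations, can be sketched as below. The abstract does not state the probability schedule, so the exponential decay `exp(-iteration / decay)` and the names `local_choice_probability`, `choose_model`, and `decay` are assumptions made purely for illustration.

```python
import math
import random


def local_choice_probability(iteration, decay=50.0):
    """Assumed annealing schedule: 1.0 at iteration 0, decaying towards 0 as training proceeds."""
    return math.exp(-iteration / decay)


def choose_model(local_model, global_model, iteration, rng, decay=50.0):
    """Keep the local model with the scheduled probability, otherwise adopt the global model."""
    if rng.random() < local_choice_probability(iteration, decay):
        return local_model
    return global_model


rng = random.Random(0)
early = [choose_model("local", "global", 0, rng) for _ in range(100)]
late = [choose_model("local", "global", 10**6, rng) for _ in range(100)]
```

Under this assumed schedule, a device always keeps its local model at iteration 0 and effectively always adopts the global model once the iteration count is very large.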
- Research Article
5
- 10.1109/tdsc.2024.3521297
- May 1, 2025
- IEEE Transactions on Dependable and Secure Computing
Federated learning (FL) has been shown vulnerable to a new class of adversarial attacks, known as model poisoning attacks (MPA), where one or more malicious clients try to poison the global model by sending carefully crafted local model updates to the central parameter server. Existing defenses that have been fixated on analyzing model parameters show limited effectiveness in detecting such malicious models. In this work, we propose FLARE, a robust model aggregation mechanism for FL, which is resilient against state-of-the-art MPAs. Instead of solely depending on model parameters, FLARE leverages the penultimate layer representations (PLRs) of the model for characterizing the adversarial influence on each local model update. We further propose a trust evaluation method that estimates a trust score for each model update based on pairwise PLR discrepancies among all model updates. Under the assumption of honest majority, FLARE assigns a low trust score to model updates that are far from the benign cluster. FLARE then aggregates the model updates weighted by their trust scores and finally updates the global model. Extensive experimental results demonstrate the effectiveness of FLARE in defending FL against various MPAs, including semantic backdoor attacks, trojan backdoor attacks, and untargeted attacks, in various FL systems.
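The trust-scoring idea in this abstract (scores derived from pairwise discrepancies between representations, with far-away updates receiving low trust) can be illustrated with the sketch below. It is not FLARE's actual procedure: the Euclidean distance between one representation vector per client, the median-distance summary, and the softmax with a `temperature` parameter are hypothetical stand-ins chosen to keep the example short.

```python
import math


def pairwise_distance(u, v):
    """Euclidean distance between two representation vectors (stand-in discrepancy measure)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def trust_scores(representations, temperature=1.0):
    """Turn each client's median distance to the other clients into a softmax trust score.

    Requires at least two clients. With an honest majority, the median distance of a
    benign client stays small, while an outlying client's median distance is large.
    """
    n = len(representations)
    median_dist = []
    for i in range(n):
        dists = sorted(
            pairwise_distance(representations[i], representations[j])
            for j in range(n) if j != i
        )
        median_dist.append(dists[len(dists) // 2])
    # Softmax over negative distances; subtracting the minimum keeps exp() well-behaved.
    smallest = min(median_dist)
    exps = [math.exp(-(d - smallest) / temperature) for d in median_dist]
    total = sum(exps)
    return [e / total for e in exps]


representations = [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [10.0, 10.0]]
scores = trust_scores(representations)
```

The resulting scores sum to 1 and could be used directly as aggregation weights; in this toy input the fourth client sits far from the other three and receives a near-zero score.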
- Conference Article
1
- 10.1109/tpsisa52974.2021.00004
- Dec 1, 2021
Federated Learning (FL) is a multiparty learning approach that can aid privacy-preserving machine learning. However, FL has several potential security and privacy threats. First, existing FL requires a central coordinator for the learning process, which introduces a single point of failure and trust issues for the shared trained model. Second, during the learning process, intentionally unreliable model updates performed by Byzantine colluding parties can lower the quality and convergence of the shared ML models. Therefore, discovering verifiable local model updates (i.e., integrity or correctness) and trusted parties in FL becomes crucial. In this paper, we propose a resilient and verifiable FL algorithm based on a reputation scheme to cope with unreliable parties. We develop a selection algorithm for the task publisher and a blockchain-based multiparty learning architecture in which local model updates are securely exchanged and verified without the central party. We also propose a novel auditing scheme to ensure that our proposed approach is resilient to Byzantine colluding attacks of up to 50% in a malicious scenario.
- Research Article
7
- 10.1109/ojcoms.2025.3558672
- Jan 1, 2025
- IEEE Open Journal of the Communications Society
Wireless Federated Learning (FL) is a distributed Artificial Intelligence (AI) framework, enabling decision-making at the network edge where data are generated. However, wireless transmissions of model updates from edge nodes to the coordinating server are vulnerable to jamming, alongside the inherent risk of poisoning the learning process. In this paper, we tackle the problem of coordinated jamming and poisoning attacks in wireless FL networks, where malicious edge nodes disrupt transmissions of legitimate local model updates to the cloud server while injecting poisoned model updates to manipulate the global model. To this end, we introduce two complementary mechanisms operating alternately. First, a robust global model aggregation algorithm is developed to address poisoning attacks by weighting edge nodes’ local model updates using a novel contribution index. The calculation of the index is inspired by the Shapley value, but it offers polynomial complexity compared to existing methods. Subsequently, a distributed power control solution for jamming attack mitigation in the uplink of the FL network is introduced based on Bayesian games with incomplete information. Both legitimate and malicious nodes aim to successfully transmit their model parameters, minimizing transmission power and time to the server, while having probabilistic knowledge about the malicious behavior of the other nodes in the game. The proposed unified approach and each individual mechanism are assessed via modeling and simulation, verifying their effectiveness in mitigating both attacks while achieving a good tradeoff between global model accuracy and consumed time and energy compared to state-of-the-art approaches.