Blockchain-Based Decentralized and Lightweight Anonymous Authentication for Federated Learning

Abstract

Federated learning (FL) is a promising technology for achieving privacy-preserving edge intelligence and has attracted extensive attention from industry and academia. However, in the FL training process, the server directly aggregates local models from mobile devices, which poses serious privacy and security threats. The identity authentication mechanism can provide FL with local model integrity and source authentication. However, the existing schemes are centralized, and most of them are computationally expensive, resulting in limited performance. To address these issues, this paper proposes a decentralized and lightweight anonymous FL identity authentication scheme, namely DAFL. In our scheme, we first design a decentralized and simplified storage FL authentication framework by combining the directed acyclic graph (DAG) blockchain and accumulator. Then, we propose a lightweight digital signature algorithm that supports batch verification for authentication. Finally, nodes interact through pseudonyms to achieve anonymous communication, and the trusted authority (TA) can track and recover the real identities of nodes when malicious behavior occurs. We theoretically prove the security of the proposed DAFL. The extensive experiments demonstrate that DAFL achieves lower authentication overhead and better convergence performance compared to existing authentication schemes and vanilla FL systems.
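The batch-verification idea in the DAFL abstract can be sketched with a toy Schnorr-style signature over a small prime-order subgroup. This is purely illustrative: the parameters are insecurely small, and the scheme is a generic one, not the paper's lightweight algorithm, accumulator, or pseudonym mechanism.

```python
import hashlib
import secrets

# Toy subgroup parameters: q divides p - 1 and g generates the order-q
# subgroup (insecurely small, for illustration only)
p, q, g = 2039, 1019, 4

def H(R, m):
    # hash-to-challenge, reduced into the exponent group
    h = hashlib.sha256(f"{R}|{m}".encode()).digest()
    return int.from_bytes(h, "big") % q

def keygen():
    x = secrets.randbelow(q - 1) + 1          # secret key
    return x, pow(g, x, p)                    # (x, public key y)

def sign(x, m):
    k = secrets.randbelow(q - 1) + 1
    R = pow(g, k, p)
    e = H(R, m)
    return R, (k + e * x) % q                 # signature (R, s)

def verify_one(y, m, sig):
    # standard check: g^s == R * y^e (mod p)
    R, s = sig
    return pow(g, s, p) == R * pow(y, H(R, m), p) % p

def verify_batch(items):
    # items: list of (y, m, (R, s)); random weights z_i prevent forged
    # signatures from cancelling each other out in the aggregate check
    lhs_exp, rhs = 0, 1
    for y, m, (R, s) in items:
        z = secrets.randbelow(q - 1) + 1
        lhs_exp = (lhs_exp + z * s) % q
        rhs = rhs * pow(R, z, p) * pow(y, z * H(R, m) % q, p) % p
    return pow(g, lhs_exp, p) == rhs
```

Checking n signatures this way costs roughly one aggregate check instead of n separate verifications, which is the saving batch verification offers a server aggregating many signed local models.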

Similar Papers
  • Conference Article
  • Cited by 10
  • 10.1109/globecom48099.2022.10000743
FLAC: Federated Learning with Autoencoder Compression and Convergence Guarantee
  • Dec 4, 2022
  • Mahdi Beitollahi + 1 more

Federated Learning (FL) is considered the key approach for privacy-preserving, distributed machine learning (ML) systems. However, due to the transmission of large ML models from users to the server in each iteration of FL, communication on resource-constrained networks is currently a fundamental bottleneck in FL, restricting the ML model complexity and user participation. One of the notable trends to reduce the communication cost of FL systems is gradient compression, in which techniques in the form of sparsification or quantization are utilized. However, these methods are fixed in advance and do not capture the redundant, correlated information across parameters of the ML models, user devices' data, and iterations of FL. Further, these methods do not fully take advantage of the error-correcting capability of the FL process. In this paper, we propose the Federated Learning with Autoencoder Compression (FLAC) approach that utilizes the redundant information and error-correcting capability of FL to compress user devices' models for uplink transmission. FLAC trains an autoencoder to encode and decode users' models at the server in the Training State and then sends the autoencoder to user devices for compressing local models in future iterations during the Compression State. To guarantee the convergence of FL, FLAC dynamically controls the autoencoder error by switching between the Training State and Compression State, adjusting its autoencoder and compression rate based on the error tolerance of the FL system. We theoretically prove that FLAC converges for FL systems with strongly convex ML models and non-i.i.d. data distributions. Our extensive experimental results over three datasets with different network architectures show that FLAC can achieve compression rates ranging from 83x to 875x while staying within 7 percent of the accuracy of the non-compressed FL systems.
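The Training/Compression State switch described in the FLAC abstract can be sketched with a linear stand-in for the autoencoder (the least-squares-optimal linear autoencoder is the top-k SVD of the stacked updates). The state names and the error-tolerance idea come from the abstract; the linear compressor and `tol` handling are assumptions.

```python
import numpy as np

def fit_linear_encoder(updates, k):
    # Least-squares-optimal *linear* autoencoder: the top-k right singular
    # vectors of the stacked client updates (a stand-in for FLAC's learned
    # nonlinear autoencoder).
    U = np.stack(updates)                 # shape (n_clients, d)
    _, _, Vt = np.linalg.svd(U, full_matrices=False)
    return Vt[:k]                         # encoder E (k, d); decoder is E.T

def compress(E, v):
    return E @ v                          # uplink payload: k floats

def decompress(E, c):
    return E.T @ c

def flac_style_round(E, v, tol):
    # FLAC-like state switch: keep compressing while the reconstruction
    # error stays within the FL system's error tolerance, else fall back
    # to sending the raw update (Training State).
    err = float(np.linalg.norm(v - decompress(E, compress(E, v))))
    if err <= tol:
        return "compress", compress(E, v)
    return "train", v
```

The switch is what gives the convergence guarantee its teeth: whenever the compressor's error exceeds the tolerance, the system reverts to exact updates until the autoencoder is retrained.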

  • Research Article
  • 10.14313/jamris/3-2024/18
Gradient Scale Monitoring for Federated Learning Systems
  • Sep 3, 2024
  • Journal of Automation, Mobile Robotics and Intelligent Systems
  • Karolina Bogacka + 2 more

As the computational and communication capabilities of edge and IoT devices grow, so do the opportunities for novel Machine Learning solutions. This leads to an increase in the popularity of Federated Learning (FL), especially in cross-device settings. However, while there is a multitude of ongoing research analyzing various aspects of the FL process, most of it does not focus on issues of operationalization and monitoring. For instance, there is a noticeable lack of research on effective problem diagnosis in FL systems. This work begins with a case study in which we compare the performance of four selected approaches to the topology of FL systems. For this purpose, we constructed and executed simulations of their training process in a controlled environment. Analyzing the obtained results, we encountered concerning periodic drops in accuracy for some of the scenarios. A successful reexamination of the experiments led us to diagnose the problem as caused by exploding gradients. In view of those findings, we formulated a potential new method for continuous monitoring of the FL training process, hinging on regular local computation of a handpicked metric: the gradient scale coefficient (GSC). We then extend our prior research to include a preliminary analysis of the effectiveness of the GSC and average gradients per layer as metrics potentially suitable for FL diagnostics. To examine their usefulness more thoroughly in different FL scenarios, we simulate the exploding gradient problem, the vanishing gradient problem, and stable gradients serving as a baseline. We then evaluate the resulting visualizations based on their clarity and computational requirements. Based on our results, we introduce a gradient monitoring suite for the FL training process.
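The monitoring approach described above can be sketched as a per-layer gradient statistic plus thresholding. The mean-absolute-gradient statistic and the threshold values here are illustrative stand-ins for the paper's gradient scale coefficient (GSC), whose exact formula the abstract does not give.

```python
import numpy as np

def layer_grad_scale(grads):
    # grads: {layer_name: gradient array}; mean absolute gradient per layer
    # (an illustrative stand-in for the gradient scale coefficient)
    return {name: float(np.mean(np.abs(g))) for name, g in grads.items()}

def flag_gradient_issues(history, hi=1e3, lo=1e-7):
    # history: one stats dict per FL round; simple thresholding surfaces
    # exploding or vanishing gradients as they develop during training
    alerts = []
    for t, stats in enumerate(history):
        for name, scale in stats.items():
            if scale > hi:
                alerts.append((t, name, "exploding"))
            elif 0 < scale < lo:
                alerts.append((t, name, "vanishing"))
    return alerts
```

Clients would compute `layer_grad_scale` locally each round and report only the scalars, keeping the monitoring overhead far below that of shipping gradients.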

  • Conference Article
  • Cited by 7
  • 10.1109/bigdata55660.2022.10021037
FedLesScan: Mitigating Stragglers in Serverless Federated Learning
  • Dec 17, 2022
  • Mohamed Elzohairy + 6 more

Federated Learning (FL) is a machine learning paradigm that enables the training of a shared global model across distributed clients while keeping the training data local. While most prior work on designing systems for FL has focused on using stateful, always-running components, recent work has shown that components in an FL system can greatly benefit from serverless computing and Function-as-a-Service technologies. To this end, distributed training of models with serverless FL systems can be more resource-efficient and cheaper than conventional FL systems. However, serverless FL systems still suffer from the presence of stragglers, i.e., slow clients due to their resource and statistical heterogeneity. While several strategies have been proposed for mitigating stragglers in FL, most methodologies do not account for the particular characteristics of serverless environments, i.e., cold starts, performance variations, and the ephemeral, stateless nature of function instances. To address this, we propose FedLesScan, a novel clustering-based semi-asynchronous training strategy specifically tailored for serverless FL. FedLesScan dynamically adapts to the behaviour of clients and minimizes the effect of stragglers on the overall system. We implement our strategy by extending an open-source serverless FL system called FedLess. Moreover, we comprehensively evaluate our strategy using 2nd-generation Google Cloud Functions with four datasets and varying percentages of stragglers. Results from our experiments show that, compared to other approaches, FedLesScan reduces training time and cost by an average of 8% and 20%, respectively, while utilizing clients better with an average increase in the effective update ratio of 17.75%.
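The clustering-based cohort selection described above can be sketched as tiering clients by their observed round durations and filling each round's cohort from the fastest tiers first. The tiering rule is an illustrative stand-in, not FedLesScan's actual clustering algorithm.

```python
import numpy as np

def tier_clients(round_times, n_tiers=3):
    # round_times: {client_id: [observed round durations]}; rank clients
    # by mean duration and split them into equal-sized speed tiers
    ranked = sorted(round_times, key=lambda c: np.mean(round_times[c]))
    size = -(-len(ranked) // n_tiers)          # ceiling division
    return [ranked[i:i + size] for i in range(0, len(ranked), size)]

def pick_cohort(tiers, k):
    # fill the training cohort from the fastest tiers first, so
    # stragglers are deferred rather than blocking the round
    cohort = []
    for tier in tiers:
        cohort.extend(tier)
        if len(cohort) >= k:
            break
    return cohort[:k]
```

In a serverless setting the duration history would also capture cold-start penalties, so a function instance that repeatedly cold-starts naturally drifts into a slower tier.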

  • Research Article
  • Cited by 5
  • 10.1016/j.csbj.2025.06.009
Revolutionizing healthcare data analytics with federated learning: A comprehensive survey of applications, systems, and future directions.
  • Jan 1, 2025
  • Computational and structural biotechnology journal
  • Nisha Thorakkattu Madathil + 4 more


  • Research Article
  • Cited by 3
  • 10.1109/tdsc.2024.3472869
SHIELD - Secure Aggregation Against Poisoning in Hierarchical Federated Learning
  • Mar 1, 2025
  • IEEE Transactions on Dependable and Secure Computing
  • Yushan Siriwardhana + 4 more

Federated Learning (FL) is a privacy-preserving distributed Machine Learning (ML) technique. Hierarchical FL is a novel variant of FL applicable to networks with multiple layers. Instead of transmitting client models to the server, hierarchical FL performs aggregations in the layers between the devices and the server. This further reduces the traffic toward the higher layers, which helps efficient link utilization. An adversary can manipulate a set of clients and send malicious model updates toward upper layers to create a trained model with a malicious objective. These attacks, also known as poisoning attacks, disrupt the model training. Like FL, Hierarchical FL is also vulnerable to poisoning attacks since the aggregators do not possess raw data. The existing robust algorithms are designed for FL systems with n clients and a server. Therefore, they are not effective against poisoning attacks in hierarchical FL systems. This paper proposes SHIELD, a novel robust aggregation technique that defends hierarchical FL systems from poisoning attacks. We evaluate SHIELD with several datasets in different application areas with different attack strategies and data distributions. The evaluation results demonstrate that SHIELD effectively defends hierarchical FL systems from poisoning attacks with a negligible impact on the benign performance of the models.
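Hierarchical robust aggregation of the kind SHIELD targets can be sketched with a coordinate-wise median applied at each layer. The median is a generic Byzantine-robust rule standing in for SHIELD's aggregation technique, which the abstract does not specify.

```python
import numpy as np

def robust_agg(updates):
    # coordinate-wise median: a generic robust aggregator that a small
    # minority of poisoned updates cannot shift arbitrarily
    return np.median(np.stack(updates), axis=0)

def hierarchical_agg(groups):
    # groups: list of edge-level client groups; aggregate robustly at the
    # intermediate layer first, then again at the server, so a poisoned
    # subset inside one group cannot dominate the global model
    return robust_agg([robust_agg(g) for g in groups])
```

Applying the robust rule per layer is the key difference from flat FL defenses: a rule designed for "n clients and a server" sees only the already-aggregated group outputs and cannot tell which group hid the attackers.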

  • Supplementary Content
  • Cited by 59
  • 10.2196/41588
Federated Machine Learning, Privacy-Enhancing Technologies, and Data Protection Laws in Medical Research: Scoping Review
  • Mar 30, 2023
  • Journal of Medical Internet Research
  • Alissa Brauneck + 7 more

Background: The collection, storage, and analysis of large data sets are relevant in many sectors. Especially in the medical field, the processing of patient data promises great progress in personalized health care. However, it is strictly regulated, such as by the General Data Protection Regulation (GDPR). These regulations mandate strict data security and data protection and thus create major challenges for collecting and using large data sets. Technologies such as federated learning (FL), especially paired with differential privacy (DP) and secure multiparty computation (SMPC), aim to solve these challenges. Objective: This scoping review aimed to summarize the current discussion on the legal questions and concerns related to FL systems in medical research. We were particularly interested in whether and to what extent FL applications and training processes are compliant with the GDPR data protection law and whether the use of the aforementioned privacy-enhancing technologies (DP and SMPC) affects this legal compliance. We placed special emphasis on the consequences for medical research and development. Methods: We performed a scoping review according to the PRISMA-ScR (Preferred Reporting Items for Systematic Reviews and Meta-Analyses extension for Scoping Reviews). We reviewed articles on Beck-Online, SSRN, ScienceDirect, arXiv, and Google Scholar published in German or English between 2016 and 2022. We examined 4 questions: whether local and global models are "personal data" as per the GDPR; what the "roles" as defined by the GDPR of various parties in FL are; who controls the data at various stages of the training process; and how, if at all, the use of privacy-enhancing technologies affects these findings. Results: We identified and summarized the findings of 56 relevant publications on FL. Local and likely also global models constitute personal data according to the GDPR. FL strengthens data protection but is still vulnerable to a number of attacks and the possibility of data leakage. These concerns can be successfully addressed through the privacy-enhancing technologies SMPC and DP. Conclusions: Combining FL with SMPC and DP is necessary to fulfill the legal data protection requirements (GDPR) in medical research dealing with personal data. Even though some technical and legal challenges remain, for example, the possibility of successful attacks on the system, combining FL with SMPC and DP creates enough security to satisfy the legal requirements of the GDPR. This combination thereby provides an attractive technical solution for health institutions willing to collaborate without exposing their data to risk. From a legal perspective, the combination provides enough built-in security measures to satisfy data protection requirements, and from a technical perspective, it provides secure systems with performance comparable to centralized machine learning applications.
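The SMPC ingredient discussed in the review can be illustrated with additive secret sharing, the simplest secure-aggregation building block: each client splits its (integer-quantized) update into shares so that no single aggregator sees the value, yet the sum is recoverable. This is a generic sketch, not any specific system covered by the review.

```python
import secrets

M = 2 ** 61 - 1  # arithmetic modulus; updates assumed integer-quantized

def share(value, n):
    # split a value into n additive shares that sum to it mod M;
    # any n-1 of the shares are uniformly random and reveal nothing
    parts = [secrets.randbelow(M) for _ in range(n - 1)]
    parts.append((value - sum(parts)) % M)
    return parts

def aggregate(all_shares):
    # all_shares[i][j] = share j of client i's value; each aggregator j
    # sums its column, and only the final recombination reveals the total
    cols = [sum(col) % M for col in zip(*all_shares)]
    return sum(cols) % M
```

This is why the review can argue that SMPC changes the legal picture: the aggregators process only random-looking shares, not the (potentially personal) model updates themselves.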

  • Conference Article
  • Cited by 29
  • 10.1109/bigdata55660.2022.10020431
Federated Learning Attacks and Defenses: A Survey
  • Dec 17, 2022
  • Yao Chen + 4 more

In terms of artificial intelligence, there are several security and privacy deficiencies in the traditional centralized training methods of machine learning models by a server. To address this limitation, federated learning (FL) has been proposed and is known for breaking down "data silos" and protecting the privacy of users. However, FL has not yet gained popularity in the industry, mainly due to its security, privacy, and high cost of communication. For the purpose of advancing the research in this field, building a robust FL system, and realizing the wide application of FL, this paper sorts out the possible attacks and corresponding defenses of the current FL system systematically. Firstly, this paper briefly introduces the basic workflow of FL and related knowledge of attacks and defenses. It reviews a great deal of research about privacy theft and malicious attacks that have been studied in recent years. Most importantly, in view of the current three classification criteria, namely the three stages of machine learning, the three different roles in federated learning, and the CIA (Confidentiality, Integrity, and Availability) guidelines on privacy protection, we divide attack approaches into two categories according to the training stage and the prediction stage in machine learning. Furthermore, we also identify the CIA property violated for each attack method and potential attack role. Various defense mechanisms are then analyzed separately from the level of privacy and security. Finally, we summarize the possible challenges in the application of FL from the aspect of attacks and defenses and discuss the future development direction of FL systems. In this way, the designed FL system has the ability to resist different attacks and is more secure and stable.

  • Research Article
  • Cited by 91
  • 10.1109/tnnls.2021.3105810
Toward On-Device Federated Learning: A Direct Acyclic Graph-Based Blockchain Approach.
  • Apr 1, 2023
  • IEEE Transactions on Neural Networks and Learning Systems
  • Mingrui Cao + 2 more

Due to the distributed characteristics of federated learning (FL), the vulnerability of the global model and the coordination of devices are the main obstacles. As a promising solution offering decentralization, scalability, and security, leveraging the blockchain in FL has attracted much attention in recent years. However, traditional consensus mechanisms designed for blockchain, such as proof of work (PoW), would cause extreme resource consumption, which greatly reduces the efficiency of FL, especially when the participating devices are wireless and resource-limited. In order to address device asynchrony and anomaly detection in FL while avoiding the extra resource consumption caused by blockchain, this article systematically introduces a framework for empowering FL using a direct acyclic graph (DAG)-based blockchain (DAG-FL). DAG-FL is first introduced in detail through a three-layer architecture, and then two algorithms, DAG-FL Controlling and DAG-FL Updating, are designed to run on different nodes to elaborate the operation of the DAG-FL consensus mechanism. After that, a Poisson process model is formulated to discuss how to set deployment parameters to keep DAG-FL stable in different FL tasks. Extensive simulations and experiments show that DAG-FL achieves better performance in terms of training efficiency and model accuracy compared with typical existing on-device FL systems as benchmarks.
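A DAG ledger of the tangle style underlying DAG-FL can be sketched minimally: each new site carries a local update and approves earlier unapproved sites (tips). Tip selection here is uniform random, a placeholder for the paper's actual consensus mechanism and model-validation step.

```python
import random

class DagLedger:
    # Minimal DAG ledger sketch: each site holds a local model update and
    # approves up to two earlier tips (loosely modelled on tangle-style
    # DAG blockchains, not the exact DAG-FL consensus mechanism).
    def __init__(self):
        self.sites = {0: {"parents": (), "update": None}}  # genesis site

    def tips(self):
        # tips are sites not yet approved by any later site
        approved = {p for s in self.sites.values() for p in s["parents"]}
        return [sid for sid in self.sites if sid not in approved]

    def attach(self, update):
        # a device validates and approves tips, then attaches its update;
        # no PoW is needed, which is the resource-saving point of the DAG
        tips = self.tips()
        parents = tuple(random.sample(tips, min(2, len(tips))))
        sid = len(self.sites)
        self.sites[sid] = {"parents": parents, "update": update}
        return sid
```

Because devices attach updates asynchronously instead of waiting for mined blocks, the ledger grows at the pace of training rather than the pace of a PoW puzzle.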

  • Conference Article
  • Cited by 30
  • 10.1109/wcnc49053.2021.9417299
ChainsFL: Blockchain-driven Federated Learning from Design to Realization
  • Mar 29, 2021
  • Shuo Yuan + 3 more

Despite the advantages of Federated Learning (FL), such as devolving model training to intelligent devices and preserving data privacy, FL still faces the risks of single points of failure and attacks from malicious participants. Recently, blockchain has been considered a promising solution that can transform FL training into a decentralized manner and improve security during training. However, traditional consensus mechanisms and architectures for blockchain can hardly handle large-scale FL tasks due to huge resource consumption, limited throughput, and high communication complexity. To this end, this paper proposes a two-layer blockchain-driven FL framework, called ChainsFL, which is composed of multiple Raft-based shard networks (layer-1) and a Direct Acyclic Graph (DAG)-based main chain (layer-2), where layer-1 limits the scale of each shard to a small range of information exchange and layer-2 allows each shard to update and share the model in parallel and asynchronously. Furthermore, the FL procedure is designed in a blockchain manner, and a refined DAG consensus mechanism is proposed to mitigate the effect of stale models. To provide a proof-of-concept implementation and evaluation, the shard blockchain based on Hyperledger Fabric is deployed on a self-made gateway as layer-1, and the self-developed DAG-based main chain is deployed on a personal computer as layer-2. The experimental results show that ChainsFL provides acceptable and sometimes better training efficiency and stronger robustness compared with typical existing FL systems.

  • Research Article
  • 10.56651/lqdtu.jst.v13.n02.925.ict
MITIGATING POISONING ATTACKS TO FEDERATED LEARNING IN IOTs ANOMALY DETECTION WITH ATTENTION AGGREGATION
  • Dec 31, 2024
  • Journal of Science and Technique
  • Ly Vu

Federated Learning (FL) is a privacy-preserving approach for training deep neural networks across decentralized devices without sharing raw data. Thus, FL has been widely applied in domains like anomaly detection in the Internet of Things (IoT). However, IoT networks/devices have limited protection capabilities, leaving FL vulnerable to data poisoning attacks. To address this challenge, we propose a new robust FL system designed to counter data poisoning attacks. Our approach, named Federated Learning with Attention Aggregation (FedAA), leverages AutoEncoder (AE) models for local anomaly detection in IoT networks. In FedAA, the global model is aggregated from local models using a novel aggregation method, named Attention Aggregation (AA), specifically designed to mitigate the impact of data poisoning attacks, which often lead to high values of the loss functions in the local models. More precisely, local models with high loss values are assigned lower attention weights when contributing to the global model aggregation, and vice versa. As a result, the proposed AA method enhances the robustness of FedAA against data poisoning attacks. We have conducted extensive experiments on three IoT anomaly detection datasets, i.e., N-BaIoT, NSL-KDD, and UNSW. The results show that FedAA is more robust than other FL systems in mitigating data poisoning attacks.
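The loss-weighted attention described for FedAA can be sketched as a softmax over negative local losses. The temperature `tau` and the exact softmax form are assumptions: the abstract only states that high-loss models receive low attention weights.

```python
import numpy as np

def attention_aggregate(models, losses, tau=1.0):
    # softmax over negative local losses: a high-loss (possibly poisoned)
    # local model receives an exponentially small attention weight
    # (an illustrative reading of FedAA's Attention Aggregation)
    w = np.exp(-np.asarray(losses, dtype=float) / tau)
    w = w / w.sum()
    stacked = np.stack(models)                 # (n_clients, d)
    return np.tensordot(w, stacked, axes=1), w # weighted average, weights
```

Compared with plain FedAvg, a poisoned client that inflates its local loss effectively disappears from the global average instead of shifting it.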

  • Research Article
  • Cited by 34
  • 10.3390/electronics11101624
Blockchain-Enabled: Multi-Layered Security Federated Learning Platform for Preserving Data Privacy
  • May 19, 2022
  • Electronics
  • Zeba Mahmood + 1 more

Privacy and data security have become the new hot topics for regulators in recent years. As a result, Federated Learning (FL), also called collaborative learning, has emerged as a new training paradigm that allows multiple, geographically distributed nodes to learn a Deep Learning (DL) model together without sharing their data. Blockchain is becoming a new trend as data protection and privacy are concerns in many sectors. We present a blockchain-enabled security model using FL that can generate an enhanced DL model without sharing data and improve privacy through higher security and access rights to data. However, existing FL approaches also have unique security vulnerabilities that malicious actors can exploit to compromise the trained model. Compared with alternatives such as providing local but private data to a server for use in ML apps, or performing ML operations on devices without benefiting from other users' data, FL prevents direct access to raw data and relies on local training of ML models. FL protects data privacy and reduces data transfer overhead by storing raw data on devices and combining locally computed model updates. We have investigated the feasibility of data and model poisoning attacks under a blockchain-enabled FL system built on the Ethereum network and a traditional FL system (without blockchain). This work fills a knowledge gap by proposing a transparent incentive mechanism that can encourage good behavior among participating decentralized nodes, avoid common problems, and contribute to the FL security literature by investigating current FL systems.

  • Conference Article
  • Cited by 4
  • 10.1109/services51467.2021.00034
Modeling and Performance Analysis on Federated Learning in Edge Computing
  • Sep 1, 2021
  • Qiang Duan + 1 more

Federated Learning (FL) deployed in edge computing may achieve advantages such as private data protection, communication cost reduction, and lower training latency compared to cloud-centric training approaches. The Anything-as-a-Service (XaaS) paradigm, as the main service provisioning model in edge computing, enables various flexible FL deployments. On the other hand, the distributed nature of FL, together with the highly diverse computing and networking infrastructures in an edge environment, introduces extra latency that may degrade FL performance. Therefore, delay performance evaluation of edge-based FL systems becomes an important research topic. However, XaaS-based FL deployment brings new challenges to performance analysis that cannot be well addressed by conventional analytical approaches. In this paper, we address these challenges by proposing a profile-based modeling and analysis method for evaluating the delay performance of edge-based FL systems. The insights obtained from the modeling and analysis may offer useful guidelines for various aspects of FL design. The application of network calculus techniques makes the proposed method general and flexible, so it may be applied to FL systems deployed on heterogeneous edge infrastructures.
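For the network-calculus approach mentioned above, the standard delay bound for token-bucket arrivals served by a rate-latency server illustrates the kind of result such analysis builds on. This is the textbook bound, not the paper's profile-based model.

```python
def delay_bound(b, r, R, T):
    # token-bucket arrivals alpha(t) = b + r*t, rate-latency service
    # beta(t) = R * max(t - T, 0): the worst-case delay is T + b/R
    if r > R:
        raise ValueError("unstable: arrival rate exceeds service rate")
    return T + b / R

def tandem(servers):
    # concatenating rate-latency servers (e.g. device -> edge -> server):
    # rates take the minimum, latencies add, so the burst b is only
    # "paid for" once end to end
    R = min(Ri for Ri, _ in servers)
    T = sum(Ti for _, Ti in servers)
    return R, T
```

Composing the per-hop service curves before applying `delay_bound` gives a tighter end-to-end bound than summing per-hop delays, which is the practical payoff of the network-calculus formalism.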

  • Research Article
  • Cited by 7
  • 10.1016/j.comcom.2024.07.014
Eco-FL: Enhancing Federated Learning sustainability in edge computing through energy-efficient client selection
  • Jul 22, 2024
  • Computer Communications
  • Martina Savoia + 3 more

In the realm of edge cloud computing (ECC), Federated Learning (FL) revolutionizes the decentralization of machine learning (ML) models by enabling their training across multiple devices. In this way, FL preserves privacy and minimizes the need for centralized data by processing data near the source. From a communication standpoint, only the model weights are exchanged between devices. By avoiding the need to send data to a centralized location for processing, FL reduces the energy required for data transfer and supports more efficient use of computing resources at the edge. FL is particularly advantageous for resource-constrained devices, such as smartphones and IoT devices. However, the limited computational power and battery capacity of such devices make energy consumption a critical challenge for FL systems. This paper introduces Eco-FL, an innovative methodology designed to optimize energy consumption in FL systems in the field of Green Edge Cloud Computing (GECC). Our approach employs a device selection process that considers the entropy of the data held by the devices and their available energy reserves. This ensures that devices with lower energy availability are less likely to participate in the training rounds, prioritizing those with higher energy capacities. To evaluate the efficacy of our methodology, we utilize FedEntropy, an entropy-based aggregation method, alongside established aggregation methods such as FedAvg and FedProx for performance comparison. The effectiveness of Eco-FL in reducing energy consumption without compromising the accuracy of the FL process is demonstrated through analyses conducted on three distinct datasets. These analyses vary the β parameter of the Dirichlet distribution and account for scenarios with both homogeneous and heterogeneous initial device charges. Our findings validate Eco-FL's potential to enhance the sustainability of FL systems by judiciously managing client participation based on energy criteria, representing a significant step forward in the development of energy-efficient FL.
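The energy-aware selection described above can be sketched as scoring devices by battery level and data entropy, with an energy floor excluding nearly-drained devices. The multiplicative score and the floor value are illustrative assumptions, not Eco-FL's exact selection formula.

```python
import numpy as np

def eco_select(energies, entropies, k, e_min=0.2):
    # score each device by battery level times label-distribution entropy;
    # devices below the energy floor e_min are excluded from the round
    # (an illustrative reading of Eco-FL's selection criteria)
    e = np.asarray(energies, dtype=float)
    h = np.asarray(entropies, dtype=float)
    scores = np.where(e >= e_min, e * h, -np.inf)
    order = np.argsort(-scores)                # best scores first
    return [int(i) for i in order[:k] if scores[i] > -np.inf]
```

Weighting by entropy keeps the selected cohort informative for non-i.i.d. data, while the energy term steers participation toward well-charged devices.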

  • Conference Article
  • Cited by 1
  • 10.1145/3578356.3592576
Towards Robust and Bias-free Federated Learning
  • May 8, 2023
  • Ousmane Touat + 1 more

Federated learning (FL) is an exciting machine learning approach where multiple devices collaboratively train a model without sharing their raw data. The FL system is vulnerable to Byzantine clients sending arbitrary model updates, and the trained model may exhibit prediction bias towards specific groups. However, FL mechanisms tackling robustness and bias mitigation have contradicting objectives, motivating the question of building an FL system that comprehensively combines both. In this paper, we first survey state-of-the-art approaches to robustness against Byzantine behavior and to bias mitigation, and analyze their respective objectives. Then, we conduct an empirical evaluation to illustrate the interplay between state-of-the-art FL robustness mechanisms and FL bias mitigation mechanisms. Specifically, we show that classical robust FL methods may inadvertently filter out benign FL clients that have statistically rare data, particularly for minority groups. Finally, we derive research directions for building more robust and bias-free FL systems.

  • Research Article
  • Cited by 11
  • 10.1016/j.neunet.2023.06.010
Contrastive encoder pre-training-based clustered federated learning for heterogeneous data
  • Jun 10, 2023
  • Neural Networks
  • Ye Lin Tun + 4 more

