A Privacy‐Preserving Threat Intelligence Model for Secure Healthcare Data Sharing in the Cloud
ABSTRACT In the contemporary healthcare landscape, secure and efficient data sharing is paramount, especially when utilizing cloud‐based platforms. The advent of cloud computing has revolutionized healthcare data sharing, offering unparalleled accessibility and scalability. However, the inherent risks associated with data breaches and privacy violations pose significant challenges, necessitating robust security measures. In such scenarios, the integration of threat intelligence with privacy‐preserving techniques becomes imperative to safeguard sensitive healthcare information. This research introduces a novel algorithm, FedGANet, alongside an integrated Privacy‐Preserving Threat Intelligence Model (FedGAN‐PPTIM), developed to strengthen secure healthcare data exchange within cloud and IoMT environments. FedGANet enhances traditional security paradigms by jointly leveraging Generative Adversarial Networks (GANs) to synthesize realistic threat scenarios and Federated Learning (FL) to enable decentralized model training without exposing sensitive patient data. The model further aligns with interoperability considerations, supporting seamless integration into diverse clinical ecosystems. The proposed FedGAN‐PPTIM framework is extensively compared with established privacy‐preserving and threat intelligence approaches across multiple evaluation metrics, including privacy leakage, threat detection rate, false positive rate, and communication overhead. The simulation analysis demonstrates that FedGANet outperforms existing methods, significantly reducing privacy leakage and communication overhead while maintaining high threat detection rates and low false positive rates. These results underscore the efficacy of FedGANet in addressing privacy and security challenges in healthcare data sharing. This article is categorized under: Technologies > Cloud Computing; Technologies > Artificial Intelligence; Commercial, Legal, and Ethical Issues > Security and Privacy
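The abstract does not detail FedGANet's aggregation rule, so as a point of reference, here is a minimal sketch of the standard federated averaging (FedAvg) step that FL pipelines of this kind typically build on. All function names, client counts, and dataset sizes below are illustrative assumptions, not the paper's implementation.

```python
# Minimal FedAvg sketch: each hospital trains locally and only model
# parameters (never patient records) reach the aggregator.
import numpy as np

def fedavg(client_weights, client_sizes):
    """Weighted average of client parameter vectors (FedAvg)."""
    sizes = np.asarray(client_sizes, dtype=float)
    coeffs = sizes / sizes.sum()                    # weight by local data size
    stacked = np.stack(client_weights)              # (n_clients, n_params)
    return (coeffs[:, None] * stacked).sum(axis=0)  # new global model

# Three hypothetical hospitals with different dataset sizes
rng = np.random.default_rng(0)
local_models = [rng.normal(size=8) for _ in range(3)]
global_model = fedavg(local_models, client_sizes=[500, 1200, 300])
print(global_model)
```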
- Research Article
- 10.7717/peerj-cs.2751
- Mar 28, 2025
- PeerJ. Computer science
Intrusion detection in Internet of Things (IoT)-based wireless sensor networks (WSNs) is essential due to their widespread use and inherent vulnerability to security breaches. Traditional centralized intrusion detection systems (IDS) face significant challenges in data privacy, computational efficiency, and scalability, particularly in resource-constrained IoT environments. This study aims to create and assess a federated learning (FL) framework that integrates with long short-term memory (LSTM) networks for efficient intrusion detection in IoT-based WSNs. We design the framework to enhance detection accuracy, minimize false positive rates (FPR), and ensure data privacy, while maintaining system scalability. Using an FL approach, multiple IoT nodes collaboratively train a global LSTM model without exchanging raw data, thereby addressing privacy concerns and improving detection capabilities. The proposed model was tested on three widely used datasets: WSN-DS, CIC-IDS-2017, and UNSW-NB15. The evaluation metrics for its performance included accuracy, F1 score, FPR, and root mean square error (RMSE). We evaluated the performance of the FL-based LSTM model against traditional centralized models, finding significant improvements in intrusion detection. The FL-based LSTM model achieved higher accuracy and a lower FPR across all datasets than centralized models. It effectively managed sequential data in WSNs, ensuring data privacy while maintaining competitive performance, particularly in complex attack scenarios. Together, FL and LSTM networks thus provide a robust approach to intrusion detection in IoT-based WSNs, improving both privacy and detection performance. This study underscores the potential of FL-based systems to address key challenges in IoT security, including data privacy, scalability, and performance, making the proposed framework suitable for real-world IoT applications.
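To make the per-client side of such a framework concrete, the sketch below shows a minimal LSTM detector and one local training pass of the kind each IoT node would run before shipping weights to the aggregator. The architecture, hyperparameters, and helper names are illustrative assumptions rather than the paper's exact model.

```python
import torch
import torch.nn as nn

class LSTMDetector(nn.Module):
    def __init__(self, n_features=10, hidden=32, n_classes=2):
        super().__init__()
        self.lstm = nn.LSTM(n_features, hidden, batch_first=True)
        self.head = nn.Linear(hidden, n_classes)

    def forward(self, x):                # x: (batch, seq_len, n_features)
        out, _ = self.lstm(x)
        return self.head(out[:, -1])     # classify from the last time step

def local_update(model, loader, epochs=1, lr=1e-3):
    """One client's local training pass; only the state_dict leaves the node."""
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    model.train()
    for _ in range(epochs):
        for x, y in loader:
            opt.zero_grad()
            loss_fn(model(x), y).backward()
            opt.step()
    return model.state_dict()

# Synthetic stand-in for one node's local traffic windows
xs = torch.randn(64, 20, 10)
ys = torch.randint(0, 2, (64,))
loader = torch.utils.data.DataLoader(
    torch.utils.data.TensorDataset(xs, ys), batch_size=16)
weights = local_update(LSTMDetector(), loader)   # sent to the aggregator
```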
- Research Article
- 10.30574/wjarr.2025.27.1.2541
- Jul 30, 2025
- World Journal of Advanced Research and Reviews
The escalating complexity and frequency of malware attacks pose a significant challenge to conventional cybersecurity frameworks, particularly in scenarios demanding high data privacy and cross-organizational threat intelligence sharing. Traditional centralized machine learning models for malware detection often rely on aggregating data in a central server, thereby increasing the risk of data breaches and limiting the deployment of models in privacy-sensitive environments such as healthcare, finance, and critical infrastructure. To address these limitations, this study explores an integrated approach that combines Federated Learning (FL) with Explainable Artificial Intelligence (XAI) for enhancing malware detection while preserving user privacy and system confidentiality. Federated learning enables the collaborative training of robust malware classifiers across multiple decentralized nodes without sharing raw data, thus maintaining local data sovereignty and complying with data protection regulations. The proposed framework incorporates deep learning architectures such as convolutional neural networks (CNNs) trained in a federated environment using feature vectors extracted from malicious binaries and behavior logs. To ensure transparency and trust in model predictions, explainable AI techniques, specifically SHAP (SHapley Additive exPlanations) and LIME (Local Interpretable Model-agnostic Explanations), are integrated, providing actionable insights into the model's decision-making process. This study also presents a comprehensive evaluation using a benchmark malware dataset distributed across simulated client environments, measuring detection accuracy, communication overhead, privacy leakage, and interpretability performance. Results demonstrate that the FL-XAI approach achieves detection rates comparable to centralized models while ensuring data confidentiality and interpretability. The research contributes to the evolving field of privacy-preserving threat intelligence by offering a scalable and explainable framework suitable for real-time cybersecurity applications.
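As an illustration of the XAI layer, the sketch below attaches a model-agnostic SHAP explainer to a locally trained classifier, in the spirit of the paper's approach. It assumes the shap and scikit-learn packages; the classifier, synthetic features, and labels are hypothetical stand-ins for the paper's CNN and behavior-log features.

```python
import numpy as np
import shap
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 6))                  # stand-in behavior-log features
y = (X[:, 0] + 0.5 * X[:, 3] > 0).astype(int)  # synthetic "malicious" label

clf = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, y)

# KernelExplainer is model-agnostic, matching the paper's SHAP usage;
# a background sample keeps the computation tractable.
explainer = shap.KernelExplainer(clf.predict_proba, shap.sample(X, 50))
shap_values = explainer.shap_values(X[:5])     # per-feature attributions
```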
- Research Article
- 10.1109/jiot.2023.3288936
- Jan 1, 2024
- IEEE Internet of Things Journal
Federated Learning (FL), as a secure distributed learning framework, has gained interest in the Internet of Things (IoT) due to its capability of protecting the privacy of participant data. However, traditional FL systems are vulnerable to Free-Rider (FR) attacks, which cause unfairness, privacy leakage, and degraded performance in FL systems. Prior defense mechanisms against FR attacks assumed that malicious clients (namely, adversaries) make up less than 50% of the total number of clients. Moreover, they targeted Anonymous FR (AFR) attacks and lose effectiveness against Selfish FR (SFR) attacks. In this paper, we propose a Parameter Audit-based Secure and fair federated learning Scheme (PASS) against FR attacks. PASS has the following key features: (a) it prevents privacy leakage with little accuracy loss; (b) it is effective in countering both AFR and SFR attacks; (c) it works well regardless of whether AFR and SFR adversaries constitute the majority of clients. Extensive experimental results validate that PASS: (a) matches the state-of-the-art method in mean square error against privacy leakage; (b) defends against AFR and SFR attacks with a higher defense success rate, lower false positive rate, and higher F1-score; (c) remains effective when adversaries exceed 50%, with an F1-score of 89% against AFR attacks and 87% against SFR attacks. Note that PASS has no negative effect on FL accuracy when no FR adversary is present.
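The abstract does not spell out PASS's audit mechanism, so the sketch below shows a much simpler generic audit in the same spirit: flag clients whose submitted update is suspiciously close to the previous global model, a common free-rider signature. This is an illustrative stand-in under that assumption, not the PASS algorithm.

```python
import numpy as np

def audit_updates(global_prev, client_models, tol=1e-3):
    """Flag clients whose 'update' barely differs from the old global model."""
    suspects = []
    for cid, w in enumerate(client_models):
        rel_change = (np.linalg.norm(w - global_prev)
                      / (np.linalg.norm(global_prev) + 1e-12))
        if rel_change < tol:           # free-riders often echo the global model
            suspects.append(cid)
    return suspects

rng = np.random.default_rng(2)
g_prev = rng.normal(size=16)
honest = [g_prev + 0.1 * rng.normal(size=16) for _ in range(3)]
free_rider = [g_prev + 1e-5 * rng.normal(size=16)]  # near-copy of global model
print(audit_updates(g_prev, honest + free_rider))   # -> [3]
```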
- Research Article
- 10.1109/access.2020.2979323
- Jan 1, 2020
- IEEE Access
Emerging mobile edge techniques and applications such as Augmented Reality (AR)/Virtual Reality (VR), the Internet of Things (IoT), and vehicle networking result in explosive growth of power and computing resource consumption. In the meantime, the volume of data generated at the edge networks is also increasing rapidly. Under this circumstance, building energy-efficient and privacy-protected communications is imperative for 5G and beyond wireless communication systems. Recent distributed learning methods such as federated learning (FL) perform well in improving resource efficiency while protecting user privacy with low communication overhead. Specifically, FL enables edge devices to learn a shared network model by aggregating local updates while keeping all training processes on local devices. This paper investigates distributed power allocation for edge users in decentralized wireless networks, with the aim of maximizing energy/spectrum efficiency while preventing privacy leakage, based on an FL framework. Due to the dynamics and complexity of wireless networks, we adopt an online Actor-Critic (AC) architecture as the local training model, and FL enables cooperation among edge users by sharing the gradients and weights generated in the Actor network. Moreover, to resolve the over-fitting problem caused by data leakage in non-independent and identically distributed (non-i.i.d.) data environments, we propose a federated augmentation mechanism based on the Wasserstein Generative Adversarial Network (WGAN) algorithm for data augmentation. Federated augmentation empowers each device to replenish its data buffer using a WGAN generative model until an i.i.d. training dataset is achieved, which significantly reduces the communication overhead in distributed learning compared to directly exchanging data samples. Numerical results reveal that the proposed federated learning based cooperation and augmentation (FL-CA) algorithm possesses good convergence properties and high robustness, and achieves a more accurate power allocation strategy than three other benchmark algorithms.
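For the augmentation step, the sketch below is a compact single-machine WGAN training loop (the original weight-clipping variant), after which a device could draw generator samples to replenish its local buffer. The toy 2-D data, network sizes, and hyperparameters are illustrative assumptions; the paper's federated exchange of generators is omitted.

```python
import torch
import torch.nn as nn

# Toy 2-D target distribution; real use would model per-class traffic features.
def real_batch(n=64):
    return torch.randn(n, 2) * 0.5 + torch.tensor([2.0, -1.0])

G = nn.Sequential(nn.Linear(8, 32), nn.ReLU(), nn.Linear(32, 2))   # generator
D = nn.Sequential(nn.Linear(2, 32), nn.ReLU(), nn.Linear(32, 1))   # critic
opt_g = torch.optim.RMSprop(G.parameters(), lr=5e-5)
opt_d = torch.optim.RMSprop(D.parameters(), lr=5e-5)

for step in range(200):
    # Critic: maximize E[D(x)] - E[D(G(z))], with weight clipping
    for _ in range(5):
        z = torch.randn(64, 8)
        loss_d = -(D(real_batch()).mean() - D(G(z).detach()).mean())
        opt_d.zero_grad(); loss_d.backward(); opt_d.step()
        for p in D.parameters():
            p.data.clamp_(-0.01, 0.01)
    # Generator: maximize E[D(G(z))]
    z = torch.randn(64, 8)
    loss_g = -D(G(z)).mean()
    opt_g.zero_grad(); loss_g.backward(); opt_g.step()

synthetic = G(torch.randn(32, 8)).detach()   # samples to top up the buffer
```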
- Research Article
- 10.1016/j.jisa.2022.103309
- Sep 1, 2022
- Journal of Information Security and Applications
High-accuracy low-cost privacy-preserving federated learning in IoT systems via adaptive perturbation
- Research Article
- 10.1109/tpds.2021.3090331
- Jan 1, 2022
- IEEE Transactions on Parallel and Distributed Systems
While petabytes of data are generated each day by a number of independent computing devices, only a few of them can ultimately be collected and used for deep learning (DL) due to concerns over data security and privacy leakage, seriously hindering the broader adoption of DL. In this circumstance, federated learning (FL) was proposed to perform model training on multiple clients' combined data without dataset sharing within the cluster. Nevertheless, federated learning with periodic model averaging (FedAvg) introduces massive communication overhead, as the synchronized data in each iteration is about the same size as the model, leading to low communication efficiency. Consequently, various proposals focusing on reducing communication rounds and compressing data have been made to decrease the communication overhead of FL. In this article, we propose Overlap-FedAvg, an innovative framework that loosens the chain-like constraint of federated learning and parallelizes the model training phase with the model communication phase (i.e., uploading local models and downloading the global model), so that the latter phase can be completely hidden by the former. Compared to vanilla FedAvg, Overlap-FedAvg is further developed with a hierarchical computing strategy, a data compensation mechanism, and a Nesterov accelerated gradient (NAG) algorithm. In particular, Overlap-FedAvg is orthogonal to many other compression methods, so they can be applied together to maximize cluster utilization. A theoretical analysis is also provided to prove the convergence of the proposed framework. Extensive experiments conducted on both image classification and natural language processing tasks with multiple models and datasets demonstrate that the proposed framework substantially reduces the communication overhead and accelerates the federated learning process.
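The core scheduling idea, hiding the exchange of round r-1's model under the computation of round r, can be sketched with a background thread as below. This is only an illustration of the overlap principle under simulated timings; the data compensation mechanism, NAG, and hierarchical computing are omitted.

```python
import threading
import time

def train_local(round_id):
    time.sleep(0.2)                      # stand-in for local SGD epochs
    print(f"round {round_id}: local training done")

def communicate(round_id):
    time.sleep(0.15)                     # stand-in for upload + download
    print(f"round {round_id}: model exchange done")

for r in range(3):
    comm = threading.Thread(target=communicate, args=(r - 1,)) if r > 0 else None
    if comm:
        comm.start()                     # ship last round's model in the background
    train_local(r)                       # computation proceeds concurrently
    if comm:
        comm.join()                      # sync before using round r-1's global model
```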
- Conference Article
- 10.1109/cvpr.2019.00257
- Jun 1, 2019
Generative adversarial networks (GANs) are a framework that learns a generative distribution through adversarial training. Recently, their class-conditional extensions (e.g., conditional GAN (cGAN) and auxiliary classifier GAN (AC-GAN)) have attracted much attention owing to their ability to learn disentangled representations and to improve training stability. However, their training requires the availability of large-scale, accurately class-labeled data, which are often laborious or impractical to collect in a real-world scenario. To remedy this, we propose a novel family of GANs called label-noise robust GANs (rGANs), which, by incorporating a noise transition model, can learn a clean-label conditional generative distribution even when training labels are noisy. In particular, we propose two variants: rAC-GAN, which is a bridging model between AC-GAN and the label-noise robust classification model, and rcGAN, which is an extension of cGAN and solves this problem with no reliance on any classifier. In addition to providing the theoretical background, we demonstrate the effectiveness of our models through extensive experiments using diverse GAN configurations, various noise settings, and multiple evaluation metrics (in which we tested 402 conditions in total). Our code is available at https://github.com/takuhirok/rGAN/.
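The noise transition model at the heart of rGANs can be made concrete with a small numpy sketch: a row-stochastic matrix T maps clean labels to noisy ones, and a clean-label posterior p is mapped to the noisy-label posterior T^T p so training can match the observed noisy labels. Class counts and the noise rate below are illustrative; the GAN architectures and losses are omitted.

```python
import numpy as np

def symmetric_noise_T(n_classes, noise_rate):
    """Row-stochastic transition matrix: T[i, j] = p(noisy=j | clean=i)."""
    T = np.full((n_classes, n_classes), noise_rate / (n_classes - 1))
    np.fill_diagonal(T, 1.0 - noise_rate)
    return T

rng = np.random.default_rng(3)
T = symmetric_noise_T(n_classes=4, noise_rate=0.3)

# Corrupt clean labels through T, as in the noisy-label training setting
clean = rng.integers(0, 4, size=10)
noisy = np.array([rng.choice(4, p=T[c]) for c in clean])

# rAC-GAN idea: map the classifier's clean-label posterior to the
# noisy-label posterior via T^T, so it can be trained on noisy labels.
p_clean = np.array([0.7, 0.1, 0.1, 0.1])
p_noisy = T.T @ p_clean
print(p_noisy)
```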
- Book Chapter
- 10.71443/9789349552029-15
- Mar 4, 2025
The rapid evolution of cyber threats necessitates innovative approaches to enhance global cybersecurity collaboration. Federated Learning (FL) has emerged as a decentralized machine learning paradigm that enables distributed threat intelligence sharing while maintaining data privacy and security. This chapter explores the application of FL for large-scale cybersecurity networks, addressing critical challenges in scalability, security, and communication efficiency. The focus is on optimizing secure aggregation techniques to enable efficient and privacy-preserving model updates across heterogeneous and resource-constrained environments. Key solutions such as hierarchical aggregation, sparse model updates, and blockchain-based enhancements are discussed to mitigate the computational and communication overheads inherent in federated systems. Additionally, the chapter investigates the integration of advanced cryptographic methods, including homomorphic encryption and differential privacy, to strengthen the security of federated networks against adversarial attacks. By leveraging FL’s potential, organizations can share threat intelligence across global networks without compromising sensitive data, significantly improving real-time cyber threat detection and response. The chapter concludes by identifying future research directions for overcoming existing challenges and further optimizing federated models in cybersecurity.
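Of the techniques listed, hierarchical aggregation is the most direct to illustrate: edge aggregators average their own clients first, so the global server only receives a few edge summaries instead of every client model. The two-tier numpy sketch below is a minimal illustration with made-up tier sizes; the chapter's cryptographic protections are omitted.

```python
import numpy as np

def weighted_avg(models, sizes):
    sizes = np.asarray(sizes, dtype=float)
    return (sizes[:, None] * np.stack(models)).sum(axis=0) / sizes.sum()

rng = np.random.default_rng(4)

# Tier 1: each edge aggregator averages its own clients first
edges, edge_sizes = [], []
for _ in range(3):
    client_models = [rng.normal(size=6) for _ in range(4)]
    client_sizes = rng.integers(100, 500, size=4)
    edges.append(weighted_avg(client_models, client_sizes))
    edge_sizes.append(client_sizes.sum())

# Tier 2: the global server sees 3 edge summaries, not 12 client models;
# with these weights the result equals the flat weighted average.
global_model = weighted_avg(edges, edge_sizes)
print(global_model)
```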
- Research Article
- 10.64235/ph519x51
- Apr 15, 2025
- Journal of Data Analysis and Critical Management
Considering the volume, sophistication, and variety of cyber threats, network security must be intelligent, real-time, and privacy-preserving. Although successful, traditional centralized machine learning models have a number of drawbacks, including privacy risks, data bottlenecks, and single points of failure. We propose a federated learning (FL) framework for distributed network security and threat intelligence, designed to take full advantage of data distributed across heterogeneous nodes without imposing serious privacy risks on users. The framework allows distributed edge devices to jointly train deep learning models locally, exchanging only model updates with an aggregator. We evaluate the system on benchmark intrusion detection datasets under both IID and non-IID data distributions. The results show that the proposed FL-based framework maintains a high level of detection accuracy, achieves substantial reductions in communication overhead, and provides stronger privacy assurances than conventional centralized methods. Moreover, the system is resilient to common adversarial attacks, e.g., data poisoning and model inversion. The work provides a scalable and flexible architecture for next-generation cybersecurity infrastructures, especially in IoT, edge, and smart-city settings.
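The abstract reports resilience to data poisoning without naming the defense; as one standard baseline of the kind involved, the sketch below replaces FedAvg's mean with a coordinate-wise median, which a single junk update cannot drag away from the honest value. The client values are synthetic and illustrative.

```python
import numpy as np

def median_aggregate(client_models):
    """Coordinate-wise median: a common poisoning-robust alternative to FedAvg."""
    return np.median(np.stack(client_models), axis=0)

rng = np.random.default_rng(5)
honest = [np.ones(8) + 0.05 * rng.normal(size=8) for _ in range(6)]
poisoned = [np.full(8, 50.0)]                 # one attacker sends junk weights
print(median_aggregate(honest + poisoned))    # stays near the honest value 1.0
```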
- Conference Article
- 10.1109/comcomap53641.2021.9653016
- Nov 26, 2021
Traditional machine learning (ML) algorithms need to collect large amounts of user data for model training, which gives rise to endless privacy leakage and "data island" problems. To solve these problems, federated learning (FL) has emerged as an outstanding tool. FL is widely used for sixth-generation mobile network (6G) communications, artificial intelligence, and privacy-preserving applications. This article starts from the concept of FL, introduces the research status of FL algorithms and privacy-preserving technology, and further explains some current applications and future challenges. Although FL shows great promise, it still faces many challenges in enhancing privacy preservation and training-model security. Communication overhead is a problem in the encryption process; the noise threshold for different scenarios must be resolved during noise handling; and identifying malicious attackers and reducing malicious attacks is also a notable challenge in the modeling process.
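The noise-threshold trade-off the survey mentions is exactly the tension in mechanisms like the one sketched below: clip an update's L2 norm, then add Gaussian noise, in the style of DP-SGD. The constants are illustrative assumptions; a real deployment must calibrate the noise multiplier to a target (epsilon, delta) privacy budget.

```python
import numpy as np

def dp_sanitize(update, clip_norm=1.0, noise_mult=1.1, rng=None):
    """Clip an update's L2 norm, then add Gaussian noise (DP-SGD style)."""
    rng = rng or np.random.default_rng()
    norm = np.linalg.norm(update)
    clipped = update * min(1.0, clip_norm / (norm + 1e-12))  # bound sensitivity
    return clipped + rng.normal(scale=noise_mult * clip_norm,
                                size=update.shape)

rng = np.random.default_rng(6)
raw_update = rng.normal(size=10)
print(dp_sanitize(raw_update, rng=rng))   # what actually leaves the client
```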
- Research Article
- 10.1016/j.adhoc.2024.103677
- Oct 15, 2024
- Ad Hoc Networks
A distributed intrusion detection framework for vehicular Ad Hoc networks via federated learning and Blockchain
- Research Article
- 10.54660/ijfei.2024.1.2.48-52
- Jan 1, 2024
- International Journal of Future Engineering Innovations
The rapid advancement of Artificial Intelligence (AI) and Machine Learning (ML) has revolutionized healthcare by enabling predictive diagnostics, personalized treatment, and efficient resource management. However, integrating these technologies into real-world healthcare systems presents significant challenges, particularly concerning data privacy, security, and interoperability across institutions. Federated Learning (FL) has emerged as a promising solution, allowing decentralized model training across multiple healthcare providers without transferring sensitive patient data to a central server. This manuscript explores the integration of FL and privacy-preserving AI techniques within smart healthcare systems, offering a secure and collaborative framework for medical AI applications. We present a comprehensive review of current FL architectures adapted for healthcare, highlighting their potential in tasks such as disease prediction, medical imaging analysis, and patient monitoring. Furthermore, we examine privacy-preserving mechanisms—including differential privacy, secure multi-party computation, and homomorphic encryption—that fortify FL against data leakage and adversarial attacks. A comparative analysis of these approaches is conducted in terms of scalability, performance, and compliance with healthcare regulations such as HIPAA and GDPR. Additionally, we propose an enhanced FL framework tailored for heterogeneous healthcare environments, capable of addressing data imbalance, device constraints, and communication overhead. Through simulated experiments using benchmark medical datasets, we demonstrate that our framework maintains high model accuracy while significantly reducing privacy risks and computational burden. Our findings underline the transformative potential of federated learning and privacy-preserving AI in enabling secure, equitable, and intelligent healthcare delivery across institutions, paving the way for a new era of collaborative digital medicine.
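Among the privacy mechanisms this review compares, secure multi-party computation can be illustrated with the classic pairwise-masking trick used in secure aggregation: masks cancel in the sum, so the server learns only the aggregate, never any individual hospital's update. The sketch below is a minimal single-process simulation assuming no client dropouts, with key agreement omitted.

```python
import numpy as np

rng = np.random.default_rng(7)
n_clients, dim = 3, 5
updates = [rng.normal(size=dim) for _ in range(n_clients)]

# Each ordered pair (i, j), i < j, agrees on a shared mask m_ij;
# client i adds it to its update, client j subtracts it.
masks = {(i, j): rng.normal(size=dim)
         for i in range(n_clients) for j in range(i + 1, n_clients)}

masked = []
for k in range(n_clients):
    m = updates[k].copy()
    for (i, j), mask in masks.items():
        if k == i:
            m += mask
        elif k == j:
            m -= mask
    masked.append(m)

# Server-side sum equals the true sum: every mask cancels exactly once
assert np.allclose(sum(masked), sum(updates))
print(sum(masked) / n_clients)   # aggregate model, individuals stay hidden
```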
- Research Article
- 10.3390/make7020043
- May 20, 2025
- Machine Learning and Knowledge Extraction
Deep learning models have an intrinsic privacy issue as they memorize parts of their training data, creating a privacy leakage. Membership inference attacks (MIAs) exploit this to obtain confidential information about the data used for training, aiming to steal information. They can be repurposed as a measurement of data integrity by inferring whether the data were used to train a machine learning model. While state-of-the-art attacks achieve significant privacy leakage, their requirements render them infeasible, hindering their use as practical tools to assess the magnitude of the privacy risk. Moreover, the most appropriate evaluation metric of MIA, the true positive rate at a low false positive rate, lacks interpretability. We claim that the incorporation of few-shot learning techniques into the MIA field and a suitable qualitative and quantitative privacy evaluation measure should resolve these issues. In this context, our proposal is twofold. We propose a few-shot learning-based MIA, termed the FeS-MIA model, which eases the evaluation of the privacy breach of a deep learning model by significantly reducing the number of resources required for this purpose. Furthermore, we propose an interpretable quantitative and qualitative measure of privacy, referred to as the Log-MIA measure. Jointly, these proposals provide new tools to assess privacy leakages and to ease the evaluation of the training data integrity of deep learning models, i.e., to analyze the privacy breach of a deep learning model. Experiments carried out with MIA over image classification and language modeling tasks, and a comparison to the state of the art, show that our proposals excel in identifying privacy leakages in a deep learning model with little extra information.
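FeS-MIA itself is few-shot and not specified line-by-line in the abstract; to make membership inference concrete, here is the classic loss-threshold MIA baseline that such work builds on, run on synthetic data. Members of the training set tend to have lower loss than non-members, so thresholding per-sample loss yields the attack's TPR/FPR; all names and data are illustrative.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(8)
X_train, X_out = rng.normal(size=(200, 5)), rng.normal(size=(200, 5))
y_train = (X_train[:, 0] > 0).astype(int)     # members
y_out = (X_out[:, 0] > 0).astype(int)         # non-members

model = LogisticRegression().fit(X_train, y_train)

def per_sample_loss(m, X, y):
    p = np.clip(m.predict_proba(X)[np.arange(len(y)), y], 1e-12, 1.0)
    return -np.log(p)                          # cross-entropy per example

loss_in = per_sample_loss(model, X_train, y_train)
loss_out = per_sample_loss(model, X_out, y_out)
tau = np.median(np.concatenate([loss_in, loss_out]))
tpr = (loss_in < tau).mean()    # members correctly flagged
fpr = (loss_out < tau).mean()   # non-members wrongly flagged
print(f"TPR={tpr:.2f} FPR={fpr:.2f}")
```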
- Research Article
- 10.3389/fcomp.2020.00036
- Aug 28, 2020
- Frontiers in Computer Science
Data breaches and security incidents are becoming increasingly costly, and statistics show that hackers are highly motivated to acquire confidential data, as the financial benefits are substantial. Hence, business data has become a top target for compromise. Threat intelligence has recently been introduced by organisations as a means to gain greater visibility of cyber threats, especially data breaches, in order to better protect their digital assets. A well-planned implementation of threat intelligence enables organisations to predict and (at least partially) prevent cyber crime, such as data breaches or data exfiltration (i.e., attempts to move data outside an organisation's secure perimeter). This allows an organisation to better understand different aspects of threats, including the identity of the adversary, how and why they intend to compromise digital assets, the consequences of attacks, which assets can be compromised and to what extent, and how to detect and respond to threats. A key enabler for implementing threat intelligence is building sophisticated data-driven architectures using machine learning that allow an organisation's cyber data (stored in different silos throughout its digital infrastructure) to be managed effectively. However, one of the biggest challenges of machine learning in cybersecurity is enabling an efficient implementation that scales in today's complex threat landscapes and digital infrastructures. In this paper, we review the data breach problem, discuss the challenges of implementing machine learning to mitigate security threats, and examine data intelligence for predicting cyber threats that could potentially lead to data breaches or leakage. We then illustrate how the future of effective threat intelligence is closely linked to efficiently applying machine learning approaches in this field, and outline future research directions in this area.
- Conference Article
- 10.1109/bigdata52589.2021.9672039
- Dec 15, 2021
In federated learning, there have been many optimization methods that allow flexible local updating, such as FedAvg, which has become the de facto mechanism for averaging local stochastic gradient descent without sharing the data. Classic FL methods such as FedAvg struggle with trust and data leakage issues. In FedAvg and similar techniques, clients assume the aggregator server is a trusted but curious server. However, even if the server is trusted, the models still leak a lot of data through the weights. Several techniques have been proposed to reduce data leakage. One mechanism involves sharing pieces of the data with the server, but it violates the key privacy assumption of federated learning. Other solutions, such as federated learning with differential privacy, aim to reduce data leakage by adding noise to the weights/gradients. However, there is a trade-off between accuracy and the amount of noise added. In this paper, we propose a practical federated learning algorithm for deep neural networks based on iterative model averaging, which we call FederatedTree. While FedAvg with differential privacy adds noise to the weights to provide a level of privacy, our algorithm applies secure sequential averaging without adding noise to the models. FederatedTree addresses trust issues between clients and between client and server (if one exists) and reduces the amount of data leakage without adding noise that lowers model accuracy. The results show that the FederatedTree algorithm provides a high level of privacy with higher accuracy on popular datasets: MNIST, Fashion MNIST, and CIFAR-10. Furthermore, FederatedTree utilizes a binary tree structure to reduce the sequential averaging time and remove the overhead of excessive communication between the server and the clients.
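The abstract describes averaging over a binary tree without full protocol detail; the sketch below shows the structural idea only: pairwise averaging up a tree takes log2(n) rounds instead of n-1 sequential steps, and for a power-of-two client count (assumed here) the equal-weight result matches the flat mean. This is not the exact FederatedTree protocol, and its security layer is omitted.

```python
import numpy as np

def tree_average(models):
    """Pairwise (binary-tree) averaging: log2(n) rounds instead of n-1 steps."""
    level = [m.copy() for m in models]
    while len(level) > 1:
        nxt = []
        for i in range(0, len(level) - 1, 2):
            nxt.append(0.5 * (level[i] + level[i + 1]))  # sibling nodes average
        if len(level) % 2:                               # odd node carries over
            nxt.append(level[-1])
        level = nxt
    return level[0]

rng = np.random.default_rng(9)
clients = [rng.normal(size=4) for _ in range(8)]
# With 8 clients (a power of two), the tree result equals the flat mean
assert np.allclose(tree_average(clients), np.mean(clients, axis=0))
```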