The proliferation of edge devices and advances in Internet of Things technology have created a vast array of distributed data sources, necessitating machine learning models that can operate closer to the point of data generation. Traditional centralized machine learning approaches often struggle with real-time big data applications, such as climate prediction and traffic simulation, owing to high communication costs. In this study, we introduce random node entropy pairing, a novel distributed learning method for artificial neural networks tailored to distributed computing environments. This method reduces communication overhead and mitigates data imbalance by having each node exchange only the weights of its local model with randomly selected peers during each communication round, rather than sharing its raw data. Our findings indicate that this approach significantly reduces communication costs while maintaining accuracy, even when learning from non-IID local data. Furthermore, we explore additional learning strategies that enhance system accuracy by leveraging the characteristics of this method. The results demonstrate that random node entropy pairing is an effective and efficient solution for distributed learning in environments where communication costs and data distribution present significant challenges.
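To make the communication pattern concrete, the following is a minimal sketch of one communication round in which each node trains locally and then merges weights with a single randomly chosen peer. The names (Node, exchange_round, local_update) and the plain averaging of weights are illustrative assumptions, not details taken from the paper.

import random
import numpy as np

class Node:
    """Hypothetical participant holding only its own model weights (no raw data is shared)."""
    def __init__(self, node_id, n_weights):
        self.node_id = node_id
        # Local model parameters, represented here as a flat vector for simplicity.
        self.weights = np.random.randn(n_weights)

    def local_update(self):
        # Placeholder for local training on the node's own (possibly non-IID) data.
        self.weights += 0.01 * np.random.randn(*self.weights.shape)

def exchange_round(nodes):
    """One communication round: local training, then weight exchange with a random peer."""
    for node in nodes:
        node.local_update()
    for node in nodes:
        peer = random.choice([n for n in nodes if n is not node])
        # Only model weights cross the network; the simple average used to merge
        # them is an assumption made for illustration.
        node.weights = (node.weights + peer.weights) / 2.0

# Example usage: eight nodes running five communication rounds.
nodes = [Node(i, n_weights=10) for i in range(8)]
for _ in range(5):
    exchange_round(nodes)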