Abstract

Federated Learning relies heavily on the data available at the worker nodes. In most practical use cases, the data at the worker nodes is Non-IID (non-independent-and-identically-distributed). The central server typically selects only a subset of the worker nodes to compute the global model, and if that subset contains many nodes with heterogeneous data, overall performance suffers. Selecting an optimal subset of the available worker nodes is therefore crucial for improving the accuracy and convergence time of Federated Learning models. In this paper, we analyse the impact of Non-IID data, measured using quantity skewness, on the performance of a federated learning model and design approaches to detect the level of skewness in the data distribution at the worker nodes. Building on this, we propose Federated Node Selection with Entropy (FedNSE), an approach that selects an optimal set of worker nodes to improve the accuracy and convergence time of a federated learning model. Experiments in a simulated Federated Learning environment show that FedNSE improves convergence time and accuracy over existing approaches, reducing training loss by at least 10%.
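The abstract does not spell out FedNSE's exact scoring rule, but the core idea it names, entropy-based detection of skewed worker data followed by subset selection, can be illustrated with a minimal sketch. The functions `label_entropy` and `select_workers` below are hypothetical names, and using each worker's local label counts as the skewness proxy is an assumption made for illustration only, not the paper's method.

```python
import numpy as np

def label_entropy(label_counts):
    """Shannon entropy of one worker's local label distribution.
    Higher entropy ~ more balanced (closer to IID) local data."""
    counts = np.asarray(label_counts, dtype=float)
    p = counts / counts.sum()
    p = p[p > 0]                      # ignore empty classes
    return -np.sum(p * np.log2(p))

def select_workers(per_worker_label_counts, k):
    """Pick the k workers whose local data looks least skewed,
    i.e. whose label distributions have the highest entropy."""
    scores = [label_entropy(c) for c in per_worker_label_counts]
    ranked = np.argsort(scores)[::-1]  # highest entropy first
    return ranked[:k].tolist()

# Example: 4 workers, 3 classes; worker 1 is heavily skewed.
counts = [[50, 45, 55], [140, 5, 5], [60, 70, 50], [30, 90, 30]]
print(select_workers(counts, k=2))    # -> [0, 2], the two most balanced workers
```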
