An Efficient Checkpoint Strategy for Federated Learning on Heterogeneous Fault-Prone Nodes

Jeonghun Kim, Sunggu Lee

doi:10.3390/electronics13061007

Abstract

Federated learning (FL) is a distributed machine learning method in which client nodes train deep neural network models locally on their own training data and then send the trained models to a server, which aggregates them into a single global model. This protects personal information while enabling machine learning on vast amounts of data through parallel training. The nodes that train local models are typically mobile or edge devices, from which data can be easily obtained. These devices usually run on batteries and communicate wirelessly, which limits their power budget and makes their computing performance and reliability significantly lower than those of high-performance computing servers. As a result, local training takes a long time, and if a fault occurs, the client may have to restart training from the beginning. If this happens frequently, training of the global model may slow down and its final performance may deteriorate. In a general computing system, checkpointing can be used to solve this problem, but applying an existing checkpointing method to FL may result in excessive overhead. This paper proposes a new FL method that uses checkpoints efficiently in situations with many fault-prone nodes.
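
The abstract does not specify the proposed method, so the following is only a minimal sketch of the general idea it describes, under stated assumptions: a client that checkpoints periodically resumes from its last checkpoint after a fault instead of restarting, and a server combines client models by sample-weighted averaging (a FedAvg-style rule, assumed here rather than taken from the paper). All names (`NodeFault`, `local_train`, `run_client`, `aggregate`), the in-memory `store`, and the toy "add 0.1 per step" update are hypothetical stand-ins for illustration.

```python
class NodeFault(Exception):
    """Simulated client crash (hypothetical, for illustration only)."""


def local_train(global_weights, total_steps, store, fail_at=None, checkpoint_every=5):
    # Resume from the last checkpoint if one exists; otherwise start from
    # the global model received from the server.
    step, weights = store.get("ckpt", (0, global_weights))
    weights = list(weights)
    while step < total_steps:
        if fail_at is not None and step == fail_at:
            raise NodeFault(f"fault before step {step}")
        weights = [w + 0.1 for w in weights]  # stand-in for one training step
        step += 1
        store["work"] += 1  # count how many training steps were executed
        if checkpoint_every and step % checkpoint_every == 0:
            store["ckpt"] = (step, list(weights))
    store.pop("ckpt", None)  # training finished; checkpoint no longer needed
    return weights


def run_client(global_weights, total_steps, fail_at=None, checkpoint_every=5):
    # `store` stands in for persistent storage that survives the fault.
    store = {"work": 0}
    try:
        weights = local_train(global_weights, total_steps, store, fail_at, checkpoint_every)
    except NodeFault:
        # Restart after the fault; the fault is not injected a second time.
        weights = local_train(global_weights, total_steps, store, None, checkpoint_every)
    return weights, store["work"]


def aggregate(client_weights, num_samples):
    # Sample-weighted average of client models (assumed FedAvg-style rule).
    total = sum(num_samples)
    dim = len(client_weights[0])
    return [
        sum(w[i] * n for w, n in zip(client_weights, num_samples)) / total
        for i in range(dim)
    ]


if __name__ == "__main__":
    _, with_ckpt = run_client([0.0, 0.0], 10, fail_at=7, checkpoint_every=5)
    _, without_ckpt = run_client([0.0, 0.0], 10, fail_at=7, checkpoint_every=None)
    print("steps executed with checkpoints:", with_ckpt)
    print("steps executed without checkpoints:", without_ckpt)
    print("global model:", aggregate([[1.0, 2.0], [3.0, 4.0]], [1, 3]))
```

In this toy setting the checkpointing client repeats only the steps taken since its last checkpoint, while the non-checkpointing client repeats all of them. The sketch ignores the cost of writing checkpoints, which is the overhead the abstract identifies as the problem with applying existing checkpointing methods to FL.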

Journal: Electronics
Publication Date: Mar 7, 2024
License type: CC BY 4.0

