Abstract

Federated learning (FL) enables multiple distributed devices to collaboratively learn a shared global model while keeping training data local. Due to the synchronous update mode between the server and devices, the straggler problem has become a significant bottleneck for efficient FL. Existing approaches attempt to tackle this issue through asynchronous model aggregation. However, these studies only change how the global model is updated to mitigate the straggler effect; they do not investigate the intrinsic causes of the straggler effect and therefore cannot fundamentally solve the problem. Furthermore, asynchronous approaches usually ignore slow-responding but important local updates while frequently aggregating fast-responding ones throughout training, which may degrade model accuracy. Thus, we propose FedTCR, a novel Federated learning approach via Taming Computing Resources. FedTCR includes a coarse-grained logical computing cluster construction algorithm (LCC) and a fine-grained intra-cluster collaborative training mechanism (ICT) as part of the FL process. In this process, the computing resource heterogeneity among devices and the communication frequency between devices and the server are indirectly tamed, which substantially resolves the straggler problem and significantly improves the communication efficiency of FL. Experimental results show that FedTCR achieves much faster training, reducing the communication cost by up to 8.59× while improving model accuracy by 13.85%, compared to state-of-the-art FL methods.
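The abstract does not spell out how LCC groups devices or how ICT aggregates inside a cluster, so the following is only a minimal illustrative sketch of the general idea: devices with similar (assumed) compute capability are placed in the same logical cluster, each cluster merges its members' updates locally, and the server then receives one update per cluster rather than one per device. The scoring metric, the contiguous-split grouping rule, and the weighted-average aggregation are assumptions for exposition, not the paper's actual algorithms.

```python
# Illustrative sketch only: LCC/ICT details are not given in the abstract,
# so the compute-score metric, grouping rule, and aggregation below are
# assumptions chosen to show the cluster-then-aggregate pattern.
import numpy as np

def build_clusters(compute_scores, num_clusters):
    """Coarse-grained grouping of devices by assumed compute capability.

    compute_scores: per-device throughput estimates (e.g. samples/second).
    Devices with similar scores land in the same logical cluster, so no
    cluster is held back by a much slower member (straggler mitigation).
    """
    order = np.argsort(compute_scores)          # slowest ... fastest
    return np.array_split(order, num_clusters)  # contiguous score ranges

def intra_cluster_aggregate(cluster, local_updates, data_sizes):
    """Fine-grained collaboration inside one cluster (assumed form):
    average the members' model updates, weighted by local data size,
    and return a single update for the cluster to send to the server."""
    w = np.array([data_sizes[i] for i in cluster], dtype=float)
    w /= w.sum()
    stacked = np.stack([local_updates[i] for i in cluster])
    return (w[:, None] * stacked).sum(axis=0)

# Example: 8 devices, 3 clusters. The server aggregates one update per
# cluster instead of one per device, cutting communication frequency.
scores = np.array([1.0, 9.5, 2.2, 8.7, 3.1, 7.9, 2.8, 9.1])
updates = [np.random.randn(10) for _ in scores]   # stand-in local updates
sizes = np.random.randint(100, 1000, size=len(scores))
clusters = build_clusters(scores, num_clusters=3)
cluster_updates = [intra_cluster_aggregate(c, updates, sizes) for c in clusters]
```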
