Abstract

As the Artificial Intelligence of Things (AIoT) becomes increasingly important for modern AI applications, federated learning (FL) is envisioned as the enabling technology for AIoT, especially in large-scale, data-privacy-preserving scenarios. However, most existing FL is managed in a centralized manner (CFL), which faces scalability limitations given the explosion of AIoT devices. The key challenge for CFL is the communication bottleneck at the central model aggregation server, which incurs high server-to-worker communication delay and thus severely slows down model convergence. To address this challenge, this article introduces a generic decentralized FL (DFL) framework that can operate in either synchronous (Sync-DFL) or asynchronous (Async-DFL) mode to alleviate the communication congestion around the central server. Moreover, Async-DFL is the first DFL framework in the literature that is fully asynchronous and completely avoids worker waiting, enabling robust distributed model training in inherently heterogeneous IoT environments, where stragglers (i.e., slow devices) are common due to the widely varying computing and networking speeds of IoT devices. Our DFL framework is implemented, deployed, and evaluated on both simulation and physical testbeds. The results show that Async-DFL accelerates model training convergence to twice the speed of CFL, while maintaining convergence accuracy and effectively combating the impact of stragglers.
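
To make the asynchronous idea concrete, the following is a minimal Python sketch, not the authors' implementation: it illustrates gossip-style asynchronous model averaging in which each worker mixes its parameters with whatever neighbor models have already arrived and never blocks on stragglers. All names here (Worker, local_step, receive, mix) and the plain-averaging rule are illustrative assumptions.

    import numpy as np

    class Worker:
        """One AIoT device: trains locally, mixes with neighbors asynchronously."""

        def __init__(self, model_dim, lr=0.01):
            self.w = np.zeros(model_dim)   # local model parameters
            self.lr = lr
            self.inbox = []                # neighbor models received so far

        def local_step(self, grad_fn):
            # One local SGD step on this worker's private data.
            self.w -= self.lr * grad_fn(self.w)

        def receive(self, neighbor_w):
            # Invoked whenever a neighbor's model arrives, at any time.
            self.inbox.append(neighbor_w.copy())

        def mix(self):
            # Average with whatever has arrived; an empty inbox is fine,
            # so a fast worker never waits on a slow one (no straggler stall).
            if self.inbox:
                self.w = np.vstack([self.w] + self.inbox).mean(axis=0)
                self.inbox.clear()

Each worker simply alternates local_step and mix, so a straggler's model is absorbed whenever it eventually arrives. In practice, asynchronous schemes typically also down-weight stale models, which is one reason a fully asynchronous design such as Async-DFL needs care to preserve convergence accuracy.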
