Abstract

Federated learning (FL) is a technique for distributed machine learning that enables the use of siloed and distributed data. With FL, individual machine learning models are trained separately, and then only model parameters (e.g., the weights of a neural network) are shared and aggregated to create a global model, allowing data to remain in its original environment. While many applications can benefit from FL, existing frameworks are incomplete, cumbersome, and environment-dependent. To address these issues, we present FLoX, an FL framework built on the funcX federated serverless computing platform. FLoX decouples FL model training/inference from infrastructure management, enabling users to deploy FL models on one or more remote computers with a single line of Python code. We evaluate FLoX using three benchmark datasets deployed on ten heterogeneous and distributed compute endpoints. We show that FLoX incurs minimal overhead, especially relative to the large communication costs of transferring data between endpoints. We show how balancing the number of samples and epochs against the capacities of participating endpoints can significantly reduce training time with minimal loss in accuracy. Finally, we show that global models consistently outperform any single local model, by 8% on average.
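To make the parameter-aggregation step described above concrete, the sketch below averages per-endpoint model weights in the FedAvg style, weighting each endpoint by the number of samples it trained on. This is a minimal illustration of the general technique, not FLoX's actual API: the function aggregate_parameters and its arguments are hypothetical names introduced here.

```python
import numpy as np

def aggregate_parameters(endpoint_weights, sample_counts):
    """FedAvg-style aggregation (hypothetical helper, not the FLoX API):
    average each layer's parameters across endpoints, weighted by the
    number of samples each endpoint trained on.

    endpoint_weights: one entry per endpoint; each entry is a list of
        np.ndarray parameters, one per model layer.
    sample_counts: local training-sample count for each endpoint.
    """
    total = sum(sample_counts)
    fractions = [n / total for n in sample_counts]
    n_layers = len(endpoint_weights[0])
    # Weighted elementwise sum, computed layer by layer.
    return [
        sum(frac * layers[i] for frac, layers in zip(fractions, endpoint_weights))
        for i in range(n_layers)
    ]

# Two endpoints reporting a two-layer model; endpoint A trained on 300
# samples and endpoint B on 100, so A contributes 75% of the global model.
w_a = [np.ones((2, 2)), np.zeros(2)]
w_b = [np.zeros((2, 2)), np.ones(2)]
global_weights = aggregate_parameters([w_a, w_b], sample_counts=[300, 100])
```

Weighting by sample count keeps endpoints that trained on little data from pulling the global model as strongly as well-provisioned ones, the same trade-off the evaluation explores when balancing samples and epochs across heterogeneous endpoints.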
