Abstract

Federated learning (FL) is a privacy-preserving technique for training models on vast amounts of decentralized data and making inferences on mobile devices. As a typical language modeling problem, mobile keyboard prediction aims to suggest a probable next word or phrase, facilitating human-machine interaction through the virtual keyboard of a smartphone or laptop. Mobile keyboard prediction with FL aims to satisfy the growing demand that a high level of data privacy be preserved in artificial intelligence applications, even with distributed model training. However, there are two major problems in federated optimization for this prediction task: (1) aggregating model parameters on the server side and (2) reducing the communication costs incurred by collecting model weights. Traditional FL methods simply use averaging aggregation or ignore communication costs. To address these issues, we propose a novel Federated Mediation (FedMed) framework with adaptive aggregation, a mediation incentive scheme, and a topK strategy to tackle model aggregation and communication costs. Performance is evaluated in terms of perplexity and communication rounds. Experiments are conducted on three datasets (i.e., Penn Treebank, WikiText-2, and Yelp), and the results demonstrate that our FedMed framework achieves robust performance and outperforms baseline approaches.
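To make the two problems concrete, the sketch below (Python/NumPy; all function names and the uniform top-k rule are illustrative assumptions) shows the plain FedAvg-style weighted averaging that traditional methods rely on, together with a generic top-k selection that keeps only the largest-magnitude entries of each update to reduce communication. It is a minimal sketch of these standard ingredients, not the paper's FedMed adaptive aggregation or mediation incentive scheme.

```python
import numpy as np

def fedavg_aggregate(client_weights, client_sizes):
    """Weighted average of client parameter vectors (FedAvg-style baseline)."""
    total = float(sum(client_sizes))
    agg = np.zeros_like(client_weights[0])
    for w, n in zip(client_weights, client_sizes):
        agg += (n / total) * w
    return agg

def topk_sparsify(update, k):
    """Keep the k largest-magnitude entries of an update; zero out the rest."""
    sparse = np.zeros_like(update)
    idx = np.argsort(np.abs(update))[-k:]
    sparse[idx] = update[idx]
    return sparse

# Toy usage: three clients with different data sizes upload sparsified updates,
# and the server forms the new global parameters by weighted averaging.
client_updates = [np.random.randn(10) for _ in range(3)]
client_sizes = [120, 80, 200]
uploads = [topk_sparsify(u, k=3) for u in client_updates]
global_weights = fedavg_aggregate(uploads, client_sizes)
print(global_weights)
```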

Highlights

  • With the emerging breakthroughs of the new industrial revolution (Industry 4.0) and Internet of Things (IoT) technologies, society is stepping into a smart era in which objects are interconnected and automated by intelligent digital techniques.

  • We evaluate baseline methods, i.e., FedSGD, federated averaging (FedAvg), and FedAtt, in comparison with our Federated Mediation (FedMed) framework.

  • Our goal is to verify the proposed FedMed against FedAvg and FedAtt in terms of federated aggregation and communication efficiency.


Summary

Introduction

With the emerging breakthroughs of the new industrial revolution (Industry 4.0) and Internet of Things (IoT) technologies, society is stepping into a smart era in which objects are interconnected and automated by intelligent digital techniques. Data is subject to attack and poisoning during the transfer process. Wearable devices such as smart bands and mobile devices can be used for health status management [1,2] and smart querying [3]. The virtual keyboard can recommend several probable word options while the user is typing on a mobile device such as a smartphone, iPad, or laptop. Language models, in particular those based on recurrent neural networks (RNNs) [5], have demonstrated exceptional performance in word prediction tasks [6,7,8]. Conventional language model learning is a centralized approach in which data from all scattered devices is sent to a server for training. The data from tens of thousands of mobile devices is so enormous that gathering it on a central server for training quickly becomes impractical.
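As a generic illustration of the kind of RNN language model and perplexity metric referred to above (not the paper's architecture), the following PyTorch sketch builds a single-layer LSTM next-word predictor and computes perplexity as the exponential of the mean cross-entropy loss; the vocabulary size and layer dimensions are arbitrary assumptions.

```python
import torch
import torch.nn as nn

class RNNLanguageModel(nn.Module):
    def __init__(self, vocab_size=1000, embed_dim=64, hidden_dim=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.rnn = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, vocab_size)

    def forward(self, tokens):                 # tokens: (batch, seq_len)
        h, _ = self.rnn(self.embed(tokens))    # (batch, seq_len, hidden_dim)
        return self.out(h)                     # logits over next-word vocabulary

model = RNNLanguageModel()
tokens = torch.randint(0, 1000, (4, 20))       # toy batch of token ids
logits = model(tokens[:, :-1])                 # predict each following token
loss = nn.functional.cross_entropy(
    logits.reshape(-1, 1000), tokens[:, 1:].reshape(-1)
)
perplexity = torch.exp(loss)                   # perplexity = exp(mean NLL)
print(perplexity.item())
```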

