Abstract

The conventional machine learning process typically operates under the premise of centralized data aggregation, where all data is collected at a central location for model training. However, this raises substantial privacy concerns when the data contains private information. In this context, federated learning has emerged as a prominent solution for privacy-preserving machine learning. This paradigm allows multiple data owners, or clients, to collaboratively train a model while keeping their local data unshared. A federated learning task is typically initiated by companies, often referred to as model owners, that do not possess sufficient training data and are willing to financially remunerate clients who contribute to the federated learning model. This situation demands a trading platform that enables model owners to select clients effectively while remaining robust against malicious clients who mount poisoning attacks for unfair financial gain. To address these issues, we design a contribution-based exploration-exploitation mechanism implemented as a smart contract. The mechanism selects clients with high-quality data according to their Shapley values, which are computed from the clients' local models to quantify each client's contribution. Unlike other state-of-the-art security mechanisms, the proposed mechanism adapts to scenarios with heterogeneous data distributions and diverse attacks, mitigating the effect of malicious behavior without compromising training accuracy. To accelerate the time-consuming Shapley value calculation, we design a parallel computing algorithm that partitions blockchain nodes into multiple shards and distributes calculation tasks among them. The algorithm improves efficiency and tolerates potentially false calculation results from malicious nodes.
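To illustrate the contribution-evaluation idea, the sketch below estimates per-client Shapley values by Monte Carlo permutation sampling: each client's value is its average marginal gain in a coalition utility over random orderings of the client set. This is a generic sketch, not the paper's algorithm; the names `shapley_values` and `value_fn` are hypothetical, and in the paper's setting `value_fn` would score a model aggregated from the coalition's local models, with the sampling work distributed across blockchain shards.

```python
import random

def shapley_values(clients, value_fn, num_samples=200, seed=0):
    """Monte Carlo Shapley estimate (hypothetical helper, not the paper's
    exact algorithm): average each client's marginal contribution to
    value_fn(coalition) over randomly sampled client orderings."""
    rng = random.Random(seed)
    totals = {c: 0.0 for c in clients}
    for _ in range(num_samples):
        order = list(clients)
        rng.shuffle(order)                 # one random permutation of clients
        coalition = []
        prev = value_fn(coalition)         # utility of the empty coalition
        for c in order:
            coalition.append(c)
            cur = value_fn(coalition)      # utility after adding client c
            totals[c] += cur - prev        # c's marginal gain in this ordering
            prev = cur
    return {c: t / num_samples for c, t in totals.items()}
```

For an additive utility (coalition value is the sum of individual qualities), the estimate recovers each client's quality exactly, which makes the behavior easy to sanity-check; in the paper's setting the per-permutation evaluations are the tasks farmed out to shards.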
