A Data Augmentation Method for Vertical Federated Learning

Jianfei Zhang, Yuchen Jiang

doi:10.1155/2022/6596925

Abstract

Federated learning (FL) enables multiple organizations to jointly train a single model without revealing their private data to each other. FL can be classified into horizontal federated learning (HFL) and vertical federated learning (VFL) according to how samples and features overlap across the participants' datasets. VFL allows organizations to jointly train a machine learning model on the overlap samples, that is, samples that share the same identity across participants. However, VFL often suffers from an insufficient number of overlap samples among the participants, and this shortage degrades the performance of the global model. In this article, we propose a data augmentation method, FedDA, which uses a generative adversarial network (GAN) to increase the amount of training data. FedDA generates additional overlap data by learning the features of the limited overlap data and of the many nonoverlap data held locally, thereby expanding the overlap dataset available for training. A series of experiments was conducted on MNIST and CIFAR-10. The results show that FedDA can efficiently utilize nonoverlap samples to enhance the effect of data augmentation, generating high-quality overlap samples and expanding the set of overlap samples. Thus, when VFL is short of overlap samples, FedDA can provide abundant training data to improve the performance of the VFL model.

Highlights

Machine learning is used to extract hidden information from a large volume of existing data, and relying on data from a single participant is limiting
To verify the performance of the proposed method, we designed a series of experiments on the MNIST and CIFAR-10 datasets
Each participant holds a large amount of nonoverlap data that vertical federated learning does not utilize

Summary

Introduction

Machine learning is used to extract hidden information from a large volume of existing data, and relying on data from a single participant is limiting. Horizontal federated learning is usually applied to scenarios where the participants' datasets have nearly the same feature space but different sample identity spaces. Vertical federated learning, on the other hand, is used in scenarios where the participants' datasets have nearly the same sample identity space but different feature spaces. Much research focuses on how to build a vertical federated learning model, but these approaches can usually use only the overlap data between participants. When the overlap data between organizations is scarce, vertical federated learning yields a poorly performing model. Our contributions are as follows:
(1) We design a novel federated data augmentation method for vertical federated learning, namely FedDA, to expand the number of available samples.
(2) We propose using generative adversarial networks for data augmentation in vertical federated learning.
(3) We conduct a range of experiments on FedDA to demonstrate its effectiveness and study the quality of the data it generates under different data distributions on two typical datasets, MNIST and CIFAR-10.
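The distinction between overlap and nonoverlap samples in the vertical setting can be made concrete with a small sketch. The two parties, their sample IDs, and their feature names below are hypothetical toy data, not taken from the paper, and the plaintext join is only a logical view for illustration: in an actual VFL system the parties would not exchange raw features.

```python
# Hypothetical toy data: each party maps a sample ID to its own local features.
# The parties share some sample IDs but hold different feature spaces (the VFL setting).
party_a = {
    "u1": {"age": 34, "income": 52000},
    "u2": {"age": 27, "income": 43000},
    "u3": {"age": 45, "income": 61000},
    "u4": {"age": 52, "income": 58000},
}
party_b = {
    "u3": {"purchases": 12, "visits": 40},
    "u4": {"purchases": 3, "visits": 9},
    "u5": {"purchases": 7, "visits": 21},
}


def split_overlap(a, b):
    """Return (overlap IDs, IDs only in a, IDs only in b), each sorted."""
    ids_a, ids_b = set(a), set(b)
    return sorted(ids_a & ids_b), sorted(ids_a - ids_b), sorted(ids_b - ids_a)


def join_overlap(a, b, overlap_ids):
    """Logical view of the overlap set: both parties' features per shared ID."""
    return {sid: {**a[sid], **b[sid]} for sid in overlap_ids}


overlap, only_a, only_b = split_overlap(party_a, party_b)
vfl_training_set = join_overlap(party_a, party_b, overlap)

print("overlap samples:", overlap)
print("nonoverlap at A:", only_a)
print("nonoverlap at B:", only_b)
```

In this toy case only two of the five distinct samples are usable for conventional VFL training, while the remaining three sit unused at their owners. This is the gap that an augmentation method such as FedDA aims to narrow by generating additional overlap samples.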

Related Work

The Proposed Approach

Experiment Evaluation

Findings

Conclusion

Journal: Wireless Communications and Mobile Computing	Publication Date: Jan 24, 2022
Citations: 3	License type: CC BY 4.0
