Abstract

In real-world applications, data describing the same learning task may be distributed across different institutions (called participants), and these participants cannot share their data because of privacy requirements. Selecting an optimal feature subset while preserving the data privacy of each participant is an important and challenging problem. Moreover, the data held by some participants may be imbalanced or may even lack some classes entirely, which makes the problem harder still. To address this, this paper introduces a trusted third party that processes and integrates the best information from the participants, and proposes a multi-participant federated evolutionary feature selection algorithm for imbalanced data under privacy protection. First, a multi-level joint sample filling strategy across participants (called the sampling-rough selection-fine tuning strategy) is proposed to fill imbalanced or empty classes on each participant while preserving data privacy. Then, a federated evolutionary feature selection method based on particle swarm optimization (PSO) is proposed, in which the optimal feature subsets found by the participants' PSOs are shared periodically. Finally, the proposed algorithm is applied to 15 test datasets and compared with several typical imbalanced feature selection algorithms and two kinds of ensemble feature selection algorithms. The experimental results show that the proposed algorithm significantly improves each participant's ability to handle imbalanced datasets and improves the classification accuracy of the resulting feature subsets while protecting the participants' data privacy.
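
The periodic sharing of feature subsets between local PSOs can be illustrated with a minimal sketch. It is not the paper's algorithm: the abstract does not specify the particle encoding, the fitness function, or how the trusted third party integrates the shared subsets, so everything below is an assumption. The sketch uses a standard binary PSO with a sigmoid transfer function, a toy leave-one-out 1-NN accuracy as fitness, and a simple rule in which each participant re-evaluates the shared masks on its own data and adopts any that improves its local best. The names `fitness`, `LocalSwarm`, and `federated_select` are hypothetical, and the sample filling strategy is not modeled at all.

```python
import math
import random


def fitness(mask, data):
    """Toy fitness (assumption): leave-one-out 1-NN accuracy on selected features."""
    X, y = data
    if not any(mask):
        return 0.0
    correct = 0
    for i, xi in enumerate(X):
        best_d, best_label = None, None
        for j, xj in enumerate(X):
            if i == j:
                continue
            d = sum((a - b) ** 2 for a, b, m in zip(xi, xj, mask) if m)
            if best_d is None or d < best_d:
                best_d, best_label = d, y[j]
        correct += best_label == y[i]
    return correct / len(X)


class LocalSwarm:
    """Binary PSO running on one participant's private data."""

    def __init__(self, data, n_features, n_particles, rng):
        self.data, self.rng = data, rng
        self.pos = [[rng.random() < 0.5 for _ in range(n_features)]
                    for _ in range(n_particles)]
        self.vel = [[0.0] * n_features for _ in range(n_particles)]
        self.pbest = [p[:] for p in self.pos]
        self.pbest_fit = [fitness(p, data) for p in self.pos]
        k = max(range(n_particles), key=self.pbest_fit.__getitem__)
        self.gbest, self.gbest_fit = self.pbest[k][:], self.pbest_fit[k]

    def step(self, w=0.7, c1=1.5, c2=1.5):
        for i, (p, v) in enumerate(zip(self.pos, self.vel)):
            for d in range(len(p)):
                r1, r2 = self.rng.random(), self.rng.random()
                v[d] = (w * v[d]
                        + c1 * r1 * (self.pbest[i][d] - p[d])
                        + c2 * r2 * (self.gbest[d] - p[d]))
                v[d] = max(-6.0, min(6.0, v[d]))  # clamp velocity
                # Sigmoid transfer: velocity sets the probability of selecting feature d.
                p[d] = self.rng.random() < 1.0 / (1.0 + math.exp(-v[d]))
            f = fitness(p, self.data)
            if f > self.pbest_fit[i]:
                self.pbest[i], self.pbest_fit[i] = p[:], f
                if f > self.gbest_fit:
                    self.gbest, self.gbest_fit = p[:], f


def federated_select(datasets, n_features, rounds=5, local_iters=5,
                     n_particles=8, seed=0):
    rng = random.Random(seed)
    swarms = [LocalSwarm(d, n_features, n_particles, rng) for d in datasets]
    for _ in range(rounds):
        for s in swarms:
            for _ in range(local_iters):
                s.step()
        # Third-party step (assumed): only feature masks are exchanged, never samples.
        shared = [s.gbest[:] for s in swarms]
        for s in swarms:
            for mask in shared:
                f = fitness(mask, s.data)  # re-evaluated on the participant's own data
                if f > s.gbest_fit:
                    s.gbest, s.gbest_fit = mask[:], f
    return [(s.gbest, s.gbest_fit) for s in swarms]
```

Each entry of `datasets` is a pair `(X, y)` of feature rows and labels held by one participant; the function returns one `(mask, fitness)` pair per participant. Re-evaluating shared masks locally is one possible design choice here, since a subset that scores well on one participant's imbalanced data need not score well on another's.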
