Abstract

Due to the widespread use of distributed data mining techniques in a variety of areas, protecting the privacy of sensitive data has received increasing attention in recent years. Privacy-preserving distributed data mining (PPDDM) focuses on decentralized data analysis without disclosing sensitive information from the data owners. However, previous PPDDM methods mostly operate on a limited amount of labeled data. In the real world, by contrast, unlabeled data is abundant and labeled data is scarce. The objective of this paper is to study and analyze the privacy-preserving properties of semi-supervised learning (SSL) algorithms that combine labeled and unlabeled data, where the data is distributed among multiple data owners. We propose a PPDDM method by designing a reliable application of secure multi-party computation (MPC) to the semi-supervised tri-training algorithm. We simulate the original tri-training algorithm and the tri-training algorithm with secure MPC using different types of classifiers and datasets. The simulation results show that tri-training with secure MPC achieves almost the same accuracy as the original tri-training algorithm. In addition to evaluating performance, we compare the execution time of tri-training with secure MPC against that of the original tri-training algorithm.

