Abstract
Privacy has become a major concern in data mining, which is now used in many important applications. Distributed privacy-preserving data mining (DPPDM) addresses this concern by protecting the private information of members of a distributed system during data mining. Although DPPDM has been widely discussed in recent work, semi-supervised learning has received comparatively little attention in this field. In this paper, we propose a mixture-model-based semi-supervised DPPDM method. With our method, a site in a distributed system can initiate a learning process using its own labeled data and unlabeled data from all sites. During the process, no site's individual data is revealed to the others, no information about the data can be traced back to a specific site, and only the initiating site learns the result. The method consists of two main steps: a parameter-masking privacy-preserving Expectation-Maximization (EM) algorithm and a mixture-model-based semi-supervised learning algorithm. Experiments on both synthetic and real-world data demonstrate the effectiveness of the proposed method.
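The abstract does not spell out how parameter masking works, so the following is only a minimal sketch of one generic way masked aggregation can be done, not the paper's algorithm. It assumes pairwise additive masks that cancel in the sum, and it assumes each site contributes two local EM statistics for a single mixture component. The function names (`pairwise_masks`, `masked_sum`) and the numbers in `local_stats` are hypothetical and chosen purely for illustration.

```python
import random

def pairwise_masks(n_sites, dim, rng):
    """Return one mask vector per site; the masks sum to zero across sites."""
    masks = [[0.0] * dim for _ in range(n_sites)]
    for i in range(n_sites):
        for j in range(i + 1, n_sites):
            # Sites i and j agree on a shared random vector r (assumption):
            # i adds it and j subtracts it, so it cancels in the total.
            r = [rng.uniform(-1e3, 1e3) for _ in range(dim)]
            for k in range(dim):
                masks[i][k] += r[k]
                masks[j][k] -= r[k]
    return masks

def masked_sum(local_stats, rng):
    """Aggregate per-site statistics while only exposing masked values."""
    n_sites, dim = len(local_stats), len(local_stats[0])
    masks = pairwise_masks(n_sites, dim, rng)
    # What each site would publish: its statistics plus its own mask.
    published = [[s + m for s, m in zip(stats, mask)]
                 for stats, mask in zip(local_stats, masks)]
    # The aggregator only ever sees `published`, never `local_stats`.
    total = [sum(column) for column in zip(*published)]
    return published, total

# Hypothetical per-site statistics for one mixture component:
# [sum of responsibilities, responsibility-weighted sum of x]
local_stats = [[12.0, 30.5], [8.0, 17.0], [20.0, 61.5]]
published, total = masked_sum(local_stats, random.Random(0))

# Pooled M-step mean update computed from the aggregated statistics only.
mean = total[1] / total[0]
```

In this sketch the published vectors look random on their own, while their sum matches the sum of the true statistics up to floating-point rounding. A real protocol would also have to cover mask agreement between sites, collusion, and hiding which site contributed which message, none of which is modeled here.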