Abstract

Distributed privacy-preserving data mining (DPPDM) has attracted considerable attention. It allows multiple participants to jointly train a model on their combined datasets while preserving data privacy. Many works have studied semi-supervised learning in DPPDM, which combines labeled and unlabeled data for better performance. However, these works provide only transductive solutions: they can predict labels only for instances in the training set, not for new samples outside it. Moreover, for security reasons these methods rely on approximate calculations, which leads to sub-optimal results and limited effectiveness. In this paper, we propose a mixture-model-based solution for inductive and effective semi-supervised learning in DPPDM. Our approach combines mixture models with graph-based methods to construct an anchor mixture capable of label prediction. We also propose an optimization process that is computed accurately, rather than approximately, through secure computation protocols. Experiments on synthetic and real-world datasets demonstrate that our proposal outperforms state-of-the-art methods in both transductive and inductive tasks.
