Abstract
Multi-view spectral clustering (MVSC) has become a popular approach for extracting group structure from multiple views of data owned by different parties. A high-quality MVSC approach usually requires collecting a massive amount of data from each view party so that the MVSC algorithm can be run in a centralized manner. However, such a centralized approach raises serious privacy concerns, both because many real-world data, such as medical or financial records, are sensitive and because regulations from authorities may preclude centralized operations. Hence, it is crucial to design a new paradigm for training spectral clustering models on multi-view data in industrial scenarios. In this article, we propose a distributed and secure framework named Federated Multi-view Spectral Clustering (FMSC), in which a group of view parties collaboratively train an MVSC model without being able to learn the data of other participants. FMSC is inspired by the concept of federated learning and utilizes Homomorphic Encryption (HE) and Differential Privacy (DP) to achieve secure and private clustering. We conduct extensive experiments to verify the effectiveness of FMSC on both synthetic and real-world datasets. Evaluations show that FMSC achieves respectable clustering results compared with conventional centralized approaches.
Highlights
Multi-view clustering has received considerable attention in recent years
Inspired by the growing popularity of Federated Learning [15], a distributed architecture in which a group of view-data parties jointly train a machine learning model while no party can learn the data of other participants, we propose a new approach named Federated Multi-view Spectral Clustering (FMSC) that addresses the aforementioned challenges in a principled way
Under FMSC, we demonstrate with theoretical assurance that no private information is leaked to the central server and that little view-specific information is leaked to other view parties
Summary
Multi-view clustering has received considerable attention in recent years, because objects are usually presented in different views. Several regulations have been proposed to prohibit privacy leakage in commercial scenarios, such as the European Union General Data Protection Regulation (GDPR) [14]. This raises a critical question: how can each view-data party derive a global multi-view spectral clustering structure if the view-specific data are located on diverse distributed parties? Inspired by the growing popularity of Federated Learning [15], a distributed architecture in which a group of view-data parties jointly train a machine learning model while no party can learn the data of other participants, we propose a new approach named Federated Multi-view Spectral Clustering (FMSC) that addresses the aforementioned challenges in a principled way.
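To make the setting concrete, the following is a minimal sketch of one way view parties could share perturbed similarity information instead of raw data. It is not the FMSC protocol: the homomorphic-encryption step is omitted entirely, and the function names (`rbf_affinity`, `perturb_symmetric`, `average_matrices`), the RBF kernel, the Laplace mechanism, and the `sensitivity=1.0` placeholder are all assumptions introduced for illustration. A real privacy guarantee would require a sensitivity analysis that this sketch does not attempt.

```python
import math
import random


def rbf_affinity(points, gamma=1.0):
    """Dense RBF affinity matrix for one view; the diagonal is always 1."""
    n = len(points)
    return [
        [
            math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(points[i], points[j])))
            for j in range(n)
        ]
        for i in range(n)
    ]


def laplace_noise(scale, rng):
    # The difference of two i.i.d. exponentials with mean `scale`
    # follows a Laplace(0, scale) distribution.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def perturb_symmetric(matrix, epsilon, sensitivity=1.0, rng=None):
    """Add Laplace noise to the off-diagonal entries, keeping symmetry.

    `sensitivity` is a placeholder (assumption), not a derived bound.
    The diagonal is left untouched because it does not depend on the data.
    """
    rng = rng or random.Random()
    scale = sensitivity / epsilon
    n = len(matrix)
    noisy = [row[:] for row in matrix]
    for i in range(n):
        for j in range(i + 1, n):
            value = matrix[i][j] + laplace_noise(scale, rng)
            noisy[i][j] = noisy[j][i] = value
    return noisy


def average_matrices(matrices):
    """Server-side step: element-wise mean of the matrices it receives."""
    n = len(matrices[0])
    count = len(matrices)
    return [
        [sum(m[i][j] for m in matrices) / count for j in range(n)]
        for i in range(n)
    ]


if __name__ == "__main__":
    # Two hypothetical views of the same three objects, held by two parties.
    views = [
        [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0)],
        [(1.0,), (1.2,), (9.0,)],
    ]
    rng = random.Random(0)
    shared = [perturb_symmetric(rbf_affinity(v), epsilon=1.0, rng=rng) for v in views]
    fused = average_matrices(shared)
    for row in fused:
        print(["%.3f" % x for x in row])
```

Each party computes and perturbs its affinity matrix locally, so only noisy matrices leave the party. The fused matrix could then feed a standard spectral clustering step (graph Laplacian, eigenvectors, k-means), which is left out here to keep the sketch short.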