Abstract
In the big data era, data are often generated from different sources or observed from different views. Such data are referred to as multi-view data. Unleashing the knowledge in multi-view data is very important in big data mining and analysis, and it calls for advanced techniques that account for the diversity of the views while fusing them. Multi-view Clustering (MvC), which aims to exploit complementary and consensus information across multiple views, has attracted increasing attention in recent years. This paper summarizes a large number of multi-view clustering algorithms, provides a taxonomy according to the mechanisms and principles involved, and classifies these algorithms into five categories: co-training style algorithms, multi-kernel learning, multi-view graph clustering, multi-view subspace clustering, and multi-task multi-view clustering. Multi-view graph clustering is further categorized into graph-based, network-based, and spectral-based methods. Multi-view subspace clustering is further divided into subspace learning-based and non-negative matrix factorization-based methods. This paper not only introduces the mechanisms of each category of methods, but also gives a few examples of how these techniques are used. In addition, it lists some publicly available multi-view datasets. Overall, this paper serves as an introductory text and survey for multi-view clustering.
Highlights
In many real-world applications of big data mining and analysis, data are collected from different sources in diverse domains or obtained from various feature collectors
We provide a brief summary of further approaches, including multi-modal clustering based on Markov random fields[175], multi-view clustering ensembles that combine multi-view spectral clustering and multi-view kernel K-means clustering with ensemble technology[176], bi-level weighted Multi-view Clustering (MvC) based on K-means[177–180], and multi-view fuzzy clustering[181–184]
This paper surveys most of the existing MvC algorithms and technologies, and classifies them into five categories, i.e., co-training style algorithms, multi-kernel learning, multi-view graph clustering, multi-view subspace clustering, and multi-task multi-view clustering
Summary
In many real-world applications of big data mining and analysis, data are collected from different sources in diverse domains or obtained from various feature collectors. This survey (Big Data Mining and Analytics, June 2018, 1(2): 83-107) organizes and summarizes multi-view clustering algorithms in five categories, including the following:
Co-training style algorithms: This category of methods treats multi-view data by using a co-training strategy. It bootstraps the clustering of the different views by letting each view use prior or learned knowledge from the others.
Multi-kernel learning: This category of methods uses predefined kernels corresponding to the different views, and combines these kernels either linearly or non-linearly in order to improve clustering performance.
Multi-view graph clustering: This category of methods seeks a fusion graph (or network) across all views and applies graph-cut algorithms or other techniques (e.g., spectral clustering) to the fusion graph in order to produce the clustering result.
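To make the kernel-combination and fusion-graph ideas concrete, the following is a minimal sketch in Python with NumPy, not an implementation of any specific algorithm from the survey. It assumes two toy views of the same six samples, one predefined Gaussian kernel per view, and fixed equal weights for the linear combination (actual multi-kernel methods typically learn these weights). The function names `rbf_kernel`, `fuse_kernels`, and `spectral_bipartition` are hypothetical, and the sign of the Fiedler vector stands in for the usual k-means step of spectral clustering because only two clusters are sought.

```python
import numpy as np


def rbf_kernel(X, gamma):
    """Gaussian (RBF) kernel matrix between the rows of X."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    return np.exp(-gamma * np.maximum(d2, 0.0))


def fuse_kernels(kernels, weights):
    """Linear combination of per-view kernels; weights are normalized to sum to 1."""
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()
    return sum(wi * K for wi, K in zip(w, kernels))


def spectral_bipartition(K):
    """Two-way split from the Fiedler vector of the normalized Laplacian of K."""
    d_inv_sqrt = 1.0 / np.sqrt(K.sum(axis=1))
    L = np.eye(len(K)) - d_inv_sqrt[:, None] * K * d_inv_sqrt[None, :]
    _, vecs = np.linalg.eigh(L)      # eigenvalues are returned in ascending order
    fiedler = vecs[:, 1]             # eigenvector of the second-smallest eigenvalue
    if fiedler[0] > 0:               # fix the sign so that sample 0 gets label 0
        fiedler = -fiedler
    return (fiedler > 0).astype(int)


# Toy data (assumed for illustration): samples 0-2 and 3-5 form two groups.
view1 = np.array([[0.0, 0.0], [0.2, 0.1], [0.1, 0.3],
                  [5.0, 5.0], [5.1, 4.9], [4.8, 5.2]])
view2 = np.array([[0.0], [0.5], [0.3], [10.0], [10.2], [9.7]])

# One predefined kernel per view, then a fixed equal-weight linear fusion.
K_fused = fuse_kernels(
    [rbf_kernel(view1, gamma=0.05), rbf_kernel(view2, gamma=0.02)],
    weights=[0.5, 0.5],
)
labels = spectral_bipartition(K_fused)
print(labels)
```

The fused kernel doubles as the affinity matrix of a fusion graph, so the same sketch illustrates both the multi-kernel and the multi-view graph categories. The `gamma` values are chosen so that the fused graph stays connected; with a nearly disconnected graph the second eigenvector is no longer a reliable cluster indicator.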