Abstract

Numerous studies have shown that multi-view clustering achieves better performance on complete multi-view data. However, real-world data often has samples missing from each view and only a small number of labeled samples. Moreover, almost all existing multi-view clustering models handle incomplete multi-view data poorly and fail to fully utilize the labeled samples to reduce computational complexity, which precludes their practical application. To address these problems, this paper proposes a novel framework called Semi-supervised Multi-View Clustering with Weighted Anchor Graph Embedding (SMVC_WAGE), which is conceptually simple and efficiently generates high-quality clustering results in practice. Specifically, we introduce a simple and effective anchor strategy. Based on the selected anchor points, we exploit the intrinsic and extrinsic view information to bridge all samples and capture more reliable nonlinear relations, which greatly enhances efficiency and improves stability. Meanwhile, we construct a global fused graph that is compatible across multiple views via a parameter-free graph fusion mechanism that directly coalesces the view-wise graphs. As a result, the proposed method not only handles complete multi-view clustering well but also extends easily to incomplete multi-view cases. Experimental results show that our algorithm surpasses several state-of-the-art competitors in both clustering performance and time cost.
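The anchor idea described in the abstract can be illustrated with a generic anchor-graph sketch. This is not the SMVC_WAGE algorithm itself: the function names `anchor_graph` and `fuse_views`, the Gaussian kernel, the random choice of anchors among samples present in every view, and the plain averaging of view-wise graphs are all assumptions made for illustration. The paper's weighting and fusion rules are not given on this page.

```python
import numpy as np

def anchor_graph(X, anchors, k=3):
    """Row-stochastic sample-to-anchor affinity matrix Z of shape (n, m)."""
    # Squared Euclidean distance from every sample to every anchor.
    d2 = ((X[:, None, :] - anchors[None, :, :]) ** 2).sum(axis=2)
    n, m = d2.shape
    # Indices of the k closest anchors for each sample.
    nearest = np.argsort(d2, axis=1)[:, :k]
    rows = np.arange(n)[:, None]
    near_d2 = d2[rows, nearest]
    # Gaussian kernel with a per-sample bandwidth (assumed choice).
    sigma = near_d2.mean(axis=1, keepdims=True) + 1e-12
    Z = np.zeros((n, m))
    Z[rows, nearest] = np.exp(-near_d2 / sigma)
    return Z / Z.sum(axis=1, keepdims=True)

def fuse_views(views, masks, m=4, k=3, seed=0):
    """Average view-wise anchor graphs over the views where a sample exists.

    views: list of (n, d_v) arrays; rows of missing samples may hold NaN.
    masks: list of length-n boolean arrays, True where the sample is present.
    """
    rng = np.random.default_rng(seed)
    n = masks[0].shape[0]
    # Anchors are samples observed in every view, so that anchor columns
    # refer to the same samples across views (an assumption of this sketch).
    common = np.flatnonzero(np.all(masks, axis=0))
    anchor_idx = rng.choice(common, size=m, replace=False)
    fused = np.zeros((n, m))
    counts = np.zeros(n)
    for X, mask in zip(views, masks):
        present = np.flatnonzero(mask)
        fused[present] += anchor_graph(X[present], X[anchor_idx], k)
        counts[present] += 1
    seen = counts > 0
    # Uniform averaging needs no tuning parameter; samples missing from a
    # view are simply skipped for that view.
    fused[seen] = fused[seen] / counts[seen][:, None]
    return fused
```

Because each sample is linked to only `k` of the `m` anchors, the graph costs O(nm) to build rather than O(n²), which is the usual efficiency argument for anchor-based methods.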

Highlights

  • In many practical applications, a growing amount of real-world data naturally appears in multiple views, which are called multi-view data, where the data may be characterized by different attributes or be collected from diverse sources

  • An image can be described with different features, such as SIFT (Scale-Invariant Feature Transform), HOG (Histogram of Oriented Gradient), LBP (Local Binary Pattern), etc. [1]; a specific news story can be reported by multiple news organizations [2]; and a web page can be represented by its links, text, and images [3]

  • To explore the effectiveness of our method, these complete multi-view methods are run on five complete multi-view datasets with different percentages of labeled data, and the experimental results are reported in Tables 2–6 in terms of ACC, Normalized Mutual Information (NMI), and Purity. From the analysis of these tables, we can make the following observations: (1) From Tables 2–6, we can see that the clustering performance differs considerably across single-view clustering scenarios for all multi-view datasets. This is mainly because each view differs in its feature scales and distributions. The experimental results imply that it is necessary to research how to appropriately combine multiple views to enhance the clustering performance
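As a reference for the three metrics named above, here is a minimal sketch of how ACC, NMI, and Purity are commonly computed from ground-truth and predicted labels. The helper names are hypothetical, ACC uses the standard Hungarian matching between clusters and classes, and NMI is normalized by the arithmetic mean of the two entropies. The normalization used in the paper is not stated on this page, so that choice is an assumption.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def contingency(y_true, y_pred):
    """Count matrix C[i, j] = number of samples in class i and cluster j."""
    _, t = np.unique(np.asarray(y_true).ravel(), return_inverse=True)
    _, p = np.unique(np.asarray(y_pred).ravel(), return_inverse=True)
    C = np.zeros((t.max() + 1, p.max() + 1), dtype=np.int64)
    np.add.at(C, (t, p), 1)
    return C

def clustering_accuracy(y_true, y_pred):
    """ACC: best one-to-one mapping of clusters to classes (Hungarian)."""
    C = contingency(y_true, y_pred)
    rows, cols = linear_sum_assignment(C, maximize=True)
    return C[rows, cols].sum() / C.sum()

def purity(y_true, y_pred):
    """Purity: each cluster is credited with its majority class."""
    C = contingency(y_true, y_pred)
    return C.max(axis=0).sum() / C.sum()

def nmi(y_true, y_pred):
    """NMI with arithmetic-mean normalization (assumed variant)."""
    C = contingency(y_true, y_pred)
    P = C / C.sum()
    pt, pp = P.sum(axis=1), P.sum(axis=0)
    nz = P > 0
    mi = (P[nz] * np.log(P[nz] / np.outer(pt, pp)[nz])).sum()
    h_true = -(pt * np.log(pt)).sum()
    h_pred = -(pp * np.log(pp)).sum()
    denom = (h_true + h_pred) / 2
    # Both labelings constant: treat as perfect agreement by convention.
    return mi / denom if denom > 0 else 1.0
```

All three scores lie in [0, 1] and are invariant to how the cluster labels are numbered, which is why they suit clustering evaluation where predicted label IDs are arbitrary.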



Introduction

A growing amount of real-world data naturally appears in multiple views, which are called multi-view data, where the data may be characterized by different attributes or be collected from diverse sources. An individual view carries a wealth of information for machine learning tasks, but using it alone ignores the consistent and complementary information available across multiple views [4]. Proper use of such information can improve performance on various machine learning tasks. In terms of the techniques involved, most existing methods fall roughly into three types: matrix factorization-based, graph-based, and subspace-based approaches. As Kang et al. [5] pointed out, matrix factorization-based
