Abstract

In machine learning and pattern recognition, clustering uncertain data is an important task, and the uncertainty in the data makes the clustering process harder. Multi-view clustering has recently attracted growing attention from data miners working with certain (deterministic) data because it produces better results than grouping based on a single view. In uncertain data clustering, the similarity measure plays a crucial role, yet state-of-the-art similarity measures suffer from several limitations. For example, when the distributions of two uncertain objects overlap heavily in location, a geometric similarity measure alone is not sufficient. Conversely, a similarity measure based on probability distributions is not enough when two uncertain objects are not close to each other or are completely separated. In this study, induced kernel distance and Jeffrey divergence are fused according to the degree of overlap in each view of a dataset to construct a self-adaptive mixture similarity measure (SAM). SAM is then used with pairwise co-regularization in multi-view spectral clustering to group uncertain data. A proof of convergence of the objective function of the proposed clustering algorithm is also presented. Experiments are carried out on nine real-world deterministic datasets, three real-life uncertain datasets, and one synthetic uncertain dataset; the nine deterministic datasets are converted into uncertain datasets before the clustering algorithms are run. Experimental results show that the proposed algorithm outperforms nine state-of-the-art methods. The comparison uses five clustering evaluation metrics. The proposed method is also evaluated using null hypothesis significance tests.
