Abstract

With the rapid growth of social media, virtual communities, and networks, multiview data have become increasingly common. In general, multiview data contain different feature components in different views. Although these data are extracted in different ways (views) from diverse settings and domains, they describe the same samples, which makes them highly related. Applying single-view clustering methods to multiview data therefore makes it difficult to achieve desirable clustering results, and multiview clustering methods that exploit the available multiview information are needed. Most current multiview clustering techniques are built on k-means, because of its conceptual simplicity, or on fuzzy c-means (FCM), in which a data point can belong to more than one cluster with membership degrees between 0 and 1. However, the performance of k-means and FCM may degrade in the presence of noise and outliers, especially on large or high-dimensional datasets. The constraint imposed on the membership degrees in k-means and FCM tends to assign a correspondingly high membership value to an outlier or a noisy data point. To address these drawbacks, possibilistic c-means (PCM) relaxes the membership constraint of k-means and FCM so that outliers and noisy data points can be properly identified. There are various extensions of k-means and FCM for multiview data, but no extension of PCM for multiview data has been made in the literature. Thus, we use PCM in our proposed multiview clustering model. In this article, we propose novel weighted multiview PCM algorithms that cluster multiview data while learning view weights and feature weights, called W-MV-PCM and W-MV-PCM with L2 regularization (W-MV-PCM-L2). In multiview clustering, views may differ in importance, and each view may contain some irrelevant features.
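The effect of the membership constraint described above can be illustrated with a small sketch. It uses the standard single-view FCM membership update and the standard PCM typicality formula, not the W-MV-PCM algorithms proposed in this article. The cluster centers, the sample points, and the scale parameters `etas` are hypothetical values chosen only for illustration.

```python
def squared_distances(point, centers):
    """Squared Euclidean distance from one point to every center."""
    return [sum((p - c) ** 2 for p, c in zip(point, center)) for center in centers]


def fcm_memberships(point, centers, m=2.0):
    """Standard FCM memberships of one point; constrained to sum to 1."""
    d2 = squared_distances(point, centers)
    if any(d == 0.0 for d in d2):
        # The point coincides with a center: give it full membership there.
        hits = [1.0 if d == 0.0 else 0.0 for d in d2]
        return [h / sum(hits) for h in hits]
    inv = [d ** (-1.0 / (m - 1.0)) for d in d2]
    total = sum(inv)
    return [v / total for v in inv]


def pcm_typicalities(point, centers, etas, m=2.0):
    """Standard PCM typicalities; each lies in (0, 1] with no sum constraint."""
    d2 = squared_distances(point, centers)
    return [1.0 / (1.0 + (d / eta) ** (1.0 / (m - 1.0))) for d, eta in zip(d2, etas)]


# Hypothetical toy setup: two centers, one inlier, one far-away outlier.
centers = [(0.0, 0.0), (10.0, 0.0)]
etas = [4.0, 4.0]  # assumed per-cluster scale parameters
inlier = (1.0, 0.0)
outlier = (5.0, 100.0)

fcm_inlier = fcm_memberships(inlier, centers)
fcm_outlier = fcm_memberships(outlier, centers)
pcm_inlier = pcm_typicalities(inlier, centers, etas)
pcm_outlier = pcm_typicalities(outlier, centers, etas)

print("FCM inlier :", fcm_inlier)
print("FCM outlier:", fcm_outlier)
print("PCM inlier :", pcm_inlier)
print("PCM outlier:", pcm_outlier)
```

In this toy setup the outlier is equally far from both centers, so FCM gives it a membership of 0.5 in each cluster because the memberships must sum to 1. The PCM typicalities of the same point are both close to 0, which is what allows it to be flagged as an outlier.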
In the proposed algorithms, a learning scheme is constructed to compute the view weights and the feature weights within each view. This scheme identifies the importance of each view and, at the same time, identifies and selects the relevant features in each view. W-MV-PCM-L2 is compared with existing multiview clustering algorithms on both synthetic and real datasets. The experimental results are evaluated using accuracy rate (AR) and external validity indexes, such as the Rand index (RI) and normalized mutual information (NMI). Under the AR, RI, and NMI criteria, these comparisons show that the proposed W-MV-PCM-L2 is a feasible and effective multiview clustering algorithm.
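As a reminder of what one of the external validity indexes measures, the following sketch computes the Rand index from its usual pair-counting definition: the fraction of sample pairs on which two labelings agree about "same cluster" versus "different cluster". The function name and the toy label lists are illustrative assumptions, not the evaluation code used in the article.

```python
from itertools import combinations


def rand_index(labels_true, labels_pred):
    """Rand index: share of sample pairs treated consistently by both labelings."""
    if len(labels_true) != len(labels_pred):
        raise ValueError("label sequences must have the same length")
    pairs = list(combinations(range(len(labels_true)), 2))
    if not pairs:
        raise ValueError("at least two samples are required")
    # A pair agrees when both labelings group it together or both separate it.
    agreements = sum(
        (labels_true[i] == labels_true[j]) == (labels_pred[i] == labels_pred[j])
        for i, j in pairs
    )
    return agreements / len(pairs)


# Hypothetical toy labelings of four samples.
truth = [0, 0, 1, 1]
same_partition = [1, 1, 0, 0]  # identical grouping, cluster names swapped
poor_partition = [0, 0, 0, 1]

ri_same = rand_index(truth, same_partition)
ri_poor = rand_index(truth, poor_partition)
print(ri_same, ri_poor)
```

Because the index compares pairs rather than label values, renaming the clusters does not change the score: the relabeled partition scores 1.0, while the poor partition agrees with the ground truth on only 3 of the 6 pairs and scores 0.5.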
