Semi-supervised clustering with multi-viewpoint based similarity measure

Yang Yan, Lihui Chen, Duc Thang Nguyen

doi:10.1109/ijcnn.2012.6252650

Abstract

The traditional (dis)similarity measure between a pair of data objects in a clustering method uses only a single viewpoint, usually the origin, as its sole reference point. Recently, a novel multi-viewpoint based similarity (MVS) measure [1] has been proposed, which uses many different viewpoints when measuring similarity and has been applied successfully to data clustering. In this paper, we study how semi-supervised MVS-based clustering can be developed by incorporating prior knowledge in the form of class labels, when they are available to the user. We propose CMVS, a novel search-based semi-supervised clustering method that follows the MVS approach and relies on a small percentage of labeled objects. Two new clustering criterion functions are formulated accordingly, in which only these labeled objects serve as the viewpoints in the multi-viewpoint based similarity measure. A theoretical discussion is provided to establish that the proposed criterion functions make good use of the prior knowledge through the similarity measure itself, in addition to seeding. An empirical study on various benchmark datasets demonstrates the effectiveness and verifies the merit of the proposed semi-supervised MVS clustering.
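
To make the contrast between a single viewpoint and multiple viewpoints concrete, the following is a minimal sketch, not the CMVS criterion functions themselves (the abstract does not give them). It assumes the similarity of two objects is the average dot product of their difference vectors taken from each viewpoint, which is our reading of the general MVS idea in [1]. The function name `mvs_similarity`, the toy vectors, and the choice of two "labeled" viewpoints are hypothetical.

```python
def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(p * q for p, q in zip(a, b))


def diff(a, b):
    """Component-wise difference a - b."""
    return [p - q for p, q in zip(a, b)]


def mvs_similarity(x, y, viewpoints):
    """Average of (x - v) . (y - v) over all viewpoints v.

    Assumed form for illustration only; with the origin as the single
    viewpoint it reduces to the ordinary dot product x . y.
    """
    total = sum(dot(diff(x, v), diff(y, v)) for v in viewpoints)
    return total / len(viewpoints)


# Two toy unit-length objects (hypothetical data).
x = [1.0, 0.0]
y = [0.8, 0.6]

# Traditional setting: the origin is the only reference point.
single = mvs_similarity(x, y, [[0.0, 0.0]])

# Semi-supervised setting sketched here: labeled objects act as viewpoints.
labeled_viewpoints = [[0.0, 1.0], [-1.0, 0.0]]
multi = mvs_similarity(x, y, labeled_viewpoints)

print(single, multi)
```

With the origin as the only viewpoint the function reduces to the plain dot product, so the two settings differ only in which reference points are supplied.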

Full Text