Abstract

Semi-supervised clustering exploits a small amount of supervised information to improve the accuracy of data clustering. In this paper, a framework for semi-supervised clustering is proposed. The framework integrates seamlessly with a traditional clustering algorithm and is particularly useful in applications where a specific traditional clustering algorithm is designated for use. In the proposed framework, discriminative random fields (DRFs) are employed to model the consistency between the result of a traditional clustering algorithm and the supervised information, under the assumption of semi-supervised learning. The semi-supervised clustering problem is thus formulated as finding the label configuration with the maximum a posteriori (MAP) probability of the DRF. A procedure based on the iterated conditional modes algorithm and a metric-learning algorithm is developed to find a suboptimal MAP solution of the DRF. The proposed approach has been tested on various data sets. Experimental results show that the approach improves clustering accuracy, which demonstrates its feasibility.
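
To make the iterated conditional modes (ICM) step concrete, the following is a minimal sketch and not the procedure developed in the paper. It assumes a simple stand-in energy: a unary term equal to the squared distance from a point to a cluster centroid, plus a pairwise penalty `beta` for each neighbor carrying a different label. The function names (`centroids`, `icm`), the neighbor lists, the `seeds` dictionary of supervised labels, and the parameter `beta` are all hypothetical, and the metric-learning step is omitted entirely.

```python
from math import dist


def centroids(points, labels, k):
    """Return the mean of the points assigned to each of k labels (None if empty)."""
    dim = len(points[0])
    sums = [[0.0] * dim for _ in range(k)]
    counts = [0] * k
    for p, c in zip(points, labels):
        counts[c] += 1
        for d in range(dim):
            sums[c][d] += p[d]
    result = []
    for c in range(k):
        if counts[c] == 0:
            result.append(None)
        else:
            result.append(tuple(s / counts[c] for s in sums[c]))
    return result


def icm(points, init_labels, seeds, neighbors, k, beta=1.0, max_sweeps=20):
    """Greedy label refinement in the spirit of ICM (illustrative energy only).

    points      -- list of coordinate tuples
    init_labels -- labels produced by any traditional clustering algorithm
    seeds       -- dict {point index: known label}; these labels stay fixed
    neighbors   -- neighbors[i] is the list of indices adjacent to point i
    """
    labels = list(init_labels)
    for i, c in seeds.items():
        labels[i] = c  # supervised labels are clamped and never updated

    for _ in range(max_sweeps):
        cents = centroids(points, labels, k)
        changed = False
        for i, p in enumerate(points):
            if i in seeds:
                continue
            best, best_energy = labels[i], float("inf")
            for c in range(k):
                if cents[c] is None:
                    continue  # skip labels with no members
                unary = dist(p, cents[c]) ** 2
                pairwise = beta * sum(1 for j in neighbors[i] if labels[j] != c)
                energy = unary + pairwise
                if energy < best_energy:
                    best, best_energy = c, energy
            if best != labels[i]:
                labels[i] = best
                changed = True
        if not changed:
            break  # local minimum reached: no single-label change lowers the energy
    return labels


# Toy data: two groups on a line, a poor initial clustering, one supervised label.
points = [(0.0,), (1.0,), (2.0,), (8.0,), (9.0,), (10.0,)]
init_labels = [0, 0, 0, 0, 0, 1]
seeds = {3: 1}
neighbors = [[1], [0, 2], [1, 3], [2, 4], [3, 5], [4]]
print(icm(points, init_labels, seeds, neighbors, k=2))  # [0, 0, 0, 1, 1, 1]
```

Because ICM updates one label at a time and only accepts changes that lower the local energy, it converges to a local minimum, which is consistent with the abstract's description of the MAP solution as suboptimal.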
