Abstract

Cluster analysis uncovers natural structure in data objects from different perspectives and has become an effective data mining method. Semi-supervised clustering techniques have improved on the performance of unsupervised clustering algorithms. Clustering with guidance information is one such variant: it uses pairwise constraints derived from background knowledge. This knowledge-guided approach makes the results more interpretable, but it suffers from constraint conflicts. This paper designs knowledge augmentation-based soft constraints as a new pairwise constraint representation and proposes a Soft Constraints Kmeans (SCop-Kmeans) method to resolve constraint conflicts. Constraint knowledge is described from multiple perspectives, and the association strength of pairwise constraints is calculated to provide the basis for assigning objects to clusters. SCop-Kmeans thereby resolves the sample allocation conflicts caused by contradictions between different constraints and improves clustering stability. Experiments on public UCI standard datasets show that the proposed method further improves clustering accuracy and performs well with different numbers of constraints, which indicates that it has advantages in using constraint information to guide clustering.
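
To make the idea of soft pairwise constraints concrete, here is a minimal sketch of a generic penalty-based constrained k-means. It is not the SCop-Kmeans method itself: the abstract does not specify how association strength is computed, so this sketch assumes each constraint simply carries a user-supplied weight. The function name `soft_constrained_kmeans`, the `(i, j, weight)` constraint format, and all parameters are hypothetical choices for illustration.

```python
import numpy as np

def soft_constrained_kmeans(X, k, must_link=(), cannot_link=(), n_iter=50, seed=0):
    """Generic penalty-based k-means (illustrative only, NOT the paper's SCop-Kmeans).

    must_link / cannot_link: iterables of (i, j, weight) triples.
    The weight is an assumed stand-in for a constraint's strength.
    """
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    rng = np.random.default_rng(seed)

    # Initialise centroids from k distinct data points, then assign each
    # point to its nearest centroid (no constraints used yet).
    centers = X[rng.choice(n, size=k, replace=False)].copy()
    sq_dist = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    labels = sq_dist.argmin(axis=1)

    # Index constraints by point so each point can look up its partners.
    ml = [[] for _ in range(n)]
    cl = [[] for _ in range(n)]
    for i, j, w in must_link:
        ml[i].append((j, w))
        ml[j].append((i, w))
    for i, j, w in cannot_link:
        cl[i].append((j, w))
        cl[j].append((i, w))

    for _ in range(n_iter):
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = X[labels == c]
            if len(members):
                centers[c] = members.mean(axis=0)

        # Assignment step: squared distance plus penalties for violations.
        changed = False
        for i in range(n):
            cost = ((X[i] - centers) ** 2).sum(axis=1)
            for j, w in ml[i]:
                # Penalise every cluster except the partner's current one.
                cost += w
                cost[labels[j]] -= w
            for j, w in cl[i]:
                # Penalise joining the partner's current cluster.
                cost[labels[j]] += w
            new_label = int(cost.argmin())
            if new_label != labels[i]:
                labels[i] = new_label
                changed = True
        if not changed:
            break

    return labels, centers
```

Because a violated constraint only adds a cost, contradictory constraints never leave a point without a feasible cluster; the point goes wherever the combined cost is lowest. In this sketch, a weight much larger than the squared distances in the data behaves almost like a hard constraint, while a small weight lets the geometry dominate.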
