Abstract

Semi-supervised clustering (i.e., clustering with knowledge-based constraints) has emerged as an important variant of the traditional clustering paradigms. However, most existing semi-supervised clustering algorithms are designed for partitional clustering methods and few research efforts have been reported on semi-supervised hierarchical clustering methods. In addition, current semi-supervised clustering methods have been focused on the use of background information in the form of instance level must-link and cannot-link constraints, which are not suitable for hierarchical clustering where data objects are linked over different hierarchy levels. In this paper, we propose a novel semi-supervised hierarchical clustering framework based on ultra-metric dendrogram distance. The proposed framework is able to incorporate triple-wise relative constraints. We establish the connection between hierarchical clustering and ultra-metric transformation of dissimilarity matrix and propose two techniques (the constrained optimization technique and the transitive dissimilarity based technique) for semi-supervised hierarchical clustering. Experimental results demonstrate the effectiveness and the efficiency of our proposed methods.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call