Abstract

Although hierarchical and partitional clustering techniques have traditionally belonged to the unsupervised learning paradigm, they have been shown to produce better results when provided with partial information, which has renewed attention to this topic. Constrained clustering is a semi-supervised learning problem that combines classic clustering techniques with background knowledge given in the form of a set of constraints. In this paper, we propose to incorporate constraints into the clustering process in three phases: the first phase quantifies constraint relevance and learns a metric matrix according to that relevance; the second phase computes similarities between instances by means of the reconstruction coefficient and pairwise distances; and the third phase performs agglomerative hierarchical clustering with a reward-style stepped affinity function that favors merges satisfying the highest possible number of constraints. Experimental results, supported by Bayesian statistical testing, show a consistent improvement in favor of our proposal over previous approaches to the constrained clustering problem.
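To make the idea of a reward-style affinity concrete, the following is a minimal sketch of the third phase only, not the method proposed in the paper. It assumes constraints are given as must-link and cannot-link index pairs, uses plain Euclidean distances in place of the learned metric and reconstruction coefficients, and scores each candidate merge as the average-link distance minus a fixed reward per net satisfied constraint. The function name `constrained_agglomerative` and the `reward` parameter are hypothetical.

```python
import numpy as np

def constrained_agglomerative(X, must_link, cannot_link, n_clusters, reward=1.0):
    """Average-link agglomerative clustering with a constraint-aware merge score.

    Illustrative assumption: constraints are pairs of instance indices.
    """
    X = np.asarray(X, dtype=float)
    n = len(X)
    # Pairwise Euclidean distances (stand-in for a learned metric).
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)

    # Constraint matrix: +1 for must-link, -1 for cannot-link, 0 otherwise.
    C = np.zeros((n, n))
    for i, j in must_link:
        C[i, j] = C[j, i] = 1.0
    for i, j in cannot_link:
        C[i, j] = C[j, i] = -1.0

    clusters = [[i] for i in range(n)]
    while len(clusters) > n_clusters:
        best_score, best_pair = np.inf, None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                ia, ib = clusters[a], clusters[b]
                avg_dist = dist[np.ix_(ia, ib)].mean()
                # Must-links joined minus cannot-links violated by this merge.
                net_satisfied = C[np.ix_(ia, ib)].sum()
                # Each net satisfied constraint lowers the score by one step.
                score = avg_dist - reward * net_satisfied
                if score < best_score:
                    best_score, best_pair = score, (a, b)
        a, b = best_pair
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]

    labels = np.empty(n, dtype=int)
    for k, members in enumerate(clusters):
        labels[members] = k
    return labels

# Usage: a cannot-link between instances 1 and 2 changes the merge order.
X = [[0.0], [1.0], [1.9], [3.0]]
print(constrained_agglomerative(X, [], [], n_clusters=2))
print(constrained_agglomerative(X, [], [(1, 2)], n_clusters=2))
```

In this toy data the two closest instances are 1 and 2, so the unconstrained run merges them first; with the cannot-link constraint the penalty makes that merge less attractive than its neighbors, and the two instances end up in different clusters.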
