Abstract

Recent studies have suggested using relative distance comparisons as constraints to represent domain knowledge. A natural extension of relative comparisons is the combination of two comparisons defined on the same set of three instances. Constraints in this form, termed Relative Constraints, provide a unified knowledge representation for both partitional and hierarchical clusterings. However, many key properties of relative constraints remain unknown. In this paper, we answer the following questions, which enable the broader application of relative constraints in general clustering problems: (i) Feasibility: Does there exist a clustering that satisfies a given set of relative constraints (i.e., are the constraints consistent)? (ii) Completeness: Given a set of consistent relative constraints, how can one derive a complete clustering without running into dead ends? (iii) Informativeness: How can one extract the most informative relative constraints from given knowledge sources? We show that any hierarchical domain knowledge can be easily represented by relative constraints. We further present a hierarchical algorithm that finds a clustering satisfying all given constraints in polynomial time. Experiments show that our algorithm achieves significantly higher accuracy than an existing metric learning approach based on relative comparisons.
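To make the notion concrete, the sketch below shows one plausible encoding of relative comparisons and relative constraints, together with a satisfaction test against a hierarchical clustering. The merge-level representation of the hierarchy, the class names, and the satisfaction criterion are illustrative assumptions, not the paper's own formalization.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, Tuple

# A relative comparison states "a is closer to b than to c".
@dataclass(frozen=True)
class Comparison:
    a: str
    b: str
    c: str

# A relative constraint (assumed form) combines two comparisons
# defined on the same set of three instances.
RelativeConstraint = Tuple[Comparison, Comparison]

def merge_level(dendrogram: Dict[FrozenSet[str], int], x: str, y: str) -> int:
    """Level at which x and y are first merged, with the hierarchy given as a
    map from unordered pairs to merge levels (a hypothetical encoding)."""
    return dendrogram[frozenset((x, y))]

def satisfies(dendrogram: Dict[FrozenSet[str], int], cons: RelativeConstraint) -> bool:
    """Read comparison (a, b, c) as: a and b merge strictly below a and c."""
    return all(
        merge_level(dendrogram, cmp.a, cmp.b) < merge_level(dendrogram, cmp.a, cmp.c)
        for cmp in cons
    )

# Toy example: x and y merge at level 1; both join z at level 2.
dendrogram = {
    frozenset(("x", "y")): 1,
    frozenset(("x", "z")): 2,
    frozenset(("y", "z")): 2,
}
constraint = (Comparison("x", "y", "z"), Comparison("y", "x", "z"))
print(satisfies(dendrogram, constraint))  # True
```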
