Abstract
Clustering is an important Machine Learning task, which aims at discovering the implicit structure of data. Applying a clustering algorithm is easy, but since clustering is an unsupervised task, tuning it so that the results match the expert's expectations is much less obvious. To overcome this, expert knowledge can be integrated into the clustering process; this is generally formalized as constraints on the desired output, thus leading to constrained clustering. There are two lines of research on clustering: distance-based clustering, where data are grouped into clusters according to their dissimilarity, and conceptual clustering, where each cluster must be a concept, that is, a set of objects together with a set of properties that describe them. This second approach relies on Formal Concept Analysis and benefits from advances in Pattern Mining. The work in [66] has shown the value of declarative approaches for pattern mining and has led to a new research direction for clustering that uses declarative frameworks such as Integer Linear Programming, Constraint Programming or SAT. This has several advantages: finding a global optimum, integrating different kinds of constraints, even complex ones, into a clustering process, and even combining conceptual and distance-based clustering. In this paper we present an inventory of constraints and a survey of declarative methods for constrained clustering.