Abstract
Learning an appropriate distance measure under the supervision of side information has become a topic of significant interest within the machine learning community. In this paper, we address the problem of metric learning for constrained clustering by considering three important issues: (1) accounting for the importance degree of constraints, (2) preserving the topological structure of the data, and (3) preserving some natural distribution properties of the data. This work provides a unified way to handle these issues in constrained clustering by learning an appropriate distance measure. We model the first issue by injecting the importance degree of constraints directly into the objective function. The topological structure of the data is preserved by minimizing the reconstruction error of the data in the target space. Finally, we preserve the natural distribution properties of the data by using proximity information. We propose two methods to address these issues: the first learns a linear transformation of the data into a target space (linear-model), and the second uses kernel functions to learn an appropriate distance measure (non-linear-model). Experiments show that considering these issues significantly improves clustering accuracy.
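To make the first issue concrete, the following is a minimal sketch of how constraint importance degrees might enter an objective over a linear transformation. It is an illustration under stated assumptions, not the formulation used in this paper: the function names, the squared-distance penalty for must-link pairs, the hinge penalty for cannot-link pairs, and the `margin` parameter are all hypothetical choices, and the reconstruction and proximity terms are omitted.

```python
from math import sqrt

def transform(A, x):
    """Apply the linear map A (a list of rows) to the vector x."""
    return [sum(a * v for a, v in zip(row, x)) for row in A]

def distance(A, x, y):
    """Euclidean distance between x and y after mapping both through A."""
    tx, ty = transform(A, x), transform(A, y)
    return sqrt(sum((p - q) ** 2 for p, q in zip(tx, ty)))

def weighted_constraint_loss(A, X, must_link, cannot_link, margin=1.0):
    """Hypothetical objective: each constraint is a tuple (i, j, w), where w is
    its importance degree. Must-link pairs are penalized by squared distance;
    cannot-link pairs are penalized only when closer than `margin` (assumed)."""
    loss = 0.0
    for i, j, w in must_link:
        loss += w * distance(A, X[i], X[j]) ** 2
    for i, j, w in cannot_link:
        loss += w * max(0.0, margin - distance(A, X[i], X[j])) ** 2
    return loss

# Toy data: three 2-D points, one must-link and one cannot-link constraint.
X = [[0.0, 0.0], [3.0, 4.0], [0.0, 0.5]]
identity = [[1.0, 0.0], [0.0, 1.0]]
shrink = [[0.1, 0.0], [0.0, 0.1]]
must_link = [(0, 1, 2.0)]    # importance degree 2.0
cannot_link = [(0, 2, 1.0)]  # importance degree 1.0

loss_identity = weighted_constraint_loss(identity, X, must_link, cannot_link)
loss_shrink = weighted_constraint_loss(shrink, X, must_link, cannot_link)
```

In this toy setting, a constraint with a larger importance degree contributes proportionally more to the loss, so a learned transformation would be pushed harder to satisfy it. Shrinking the space lowers the must-link term but raises the cannot-link term, which is the trade-off such an objective has to balance.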