Abstract
Pairwise constraint propagation studies the problem of propagating scarce pairwise constraints across the entire dataset. Effective propagation algorithms have previously been designed within the graph-based semi-supervised learning framework. As a result, these methods rely critically on a good similarity measure over the data points: improper or noisy similarity measurements may dramatically degrade their performance. In this paper, we attempt to exploit the available pairwise constraints to learn a new set of similarities that are consistent with the supervisory information in the constraints, before propagating these initial constraints. Our method is a local learning algorithm. More specifically, we compute the similarities at each data point by simultaneously minimizing the local reconstruction error and the local constraint error. The proposed method has been tested on constrained clustering tasks over eight real-life datasets and is shown to achieve significant improvements over the state of the art.
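To make the idea of learning similarities at each data point more concrete, the following is a minimal sketch and not the paper's actual formulation. It assumes the local reconstruction error takes the familiar sum-to-one least-squares form used in locally linear reconstruction, and it assumes a hypothetical linear "constraint error" term that rewards weight on must-link neighbors and penalizes weight on cannot-link neighbors. The function name `local_similarities`, the parameters `lam` and `reg`, and the encoding of `z` are all illustrative assumptions.

```python
import numpy as np

def local_similarities(X, i, neighbors, z, lam=1.0, reg=1e-3):
    """Weights of point i over its neighbors under an ASSUMED objective.

    Minimizes  w^T G w - lam * z^T w   subject to  sum(w) = 1, where
    G is the local Gram matrix of the differences (x_i - x_j) and
    z[j] is +1 for a must-link with neighbor j, -1 for a cannot-link,
    and 0 when no constraint is known. The linear constraint term is a
    hypothetical stand-in for the paper's local constraint error.
    """
    diffs = X[i] - X[neighbors]                       # shape (k, d)
    G = diffs @ diffs.T                               # shape (k, k)
    k = len(neighbors)
    # Small ridge term keeps G positive definite (assumed regularizer).
    G = G + (reg * np.trace(G) + 1e-12) * np.eye(k)

    # KKT system of the equality-constrained quadratic program:
    #   2 G w - mu * 1 = lam * z,   1^T w = 1
    A = np.zeros((k + 1, k + 1))
    A[:k, :k] = 2.0 * G
    A[:k, k] = -1.0
    A[k, :k] = 1.0
    b = np.concatenate([lam * np.asarray(z, dtype=float), [1.0]])
    return np.linalg.solve(A, b)[:k]

# Toy usage on random data (illustrative only).
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 3))
i = 0
dists = np.linalg.norm(X - X[i], axis=1)
neighbors = np.argsort(dists)[1:5]                    # 4 nearest, excluding i

z = np.zeros(4)
z[0] = -1.0                                           # cannot-link with nearest neighbor
w_plain = local_similarities(X, i, neighbors, np.zeros(4))
w_constrained = local_similarities(X, i, neighbors, z)
```

With no constraints (`z` all zeros) the sketch reduces to plain sum-to-one reconstruction weights. Marking a neighbor as cannot-link lowers its weight relative to that baseline. The sketch does not enforce non-negative weights, which a practical similarity graph would likely require.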
Highlights
Prior knowledge on whether or not two objects belong to the same cluster is expressed in terms of must-link constraints and cannot-link constraints, respectively, which together form the pairwise constraints [2, 19].
Linear neighborhood propagation (LNP)+PCP benefits from an increasing number of constraints. However, the similarity graph learning process of LNP considers only the “local reconstruction error” and does not take any supervisory information, e.g. the pairwise constraints, into account, so our learning similarity for constraint propagation (LSCP) still performs significantly better than LNP+PCP on each image dataset.
As the number of pairwise constraints grows, the performance of our LSCP improves consistently and clearly on all four datasets, whereas affinity propagation (AP), spectral learning (SL) and constrained clustering by spectral kernel learning (CCSKL) [10] do not show this trend in constrained clustering.
Summary
Prior knowledge on whether or not two objects belong to the same cluster is expressed in terms of must-link constraints and cannot-link constraints, respectively, which together form the pairwise constraints [2, 19]. It is hard to infer instance labels only from pairwise constraints, especially for multi-class data. This means that pairwise constraints are weaker and more general than explicit data labels. While it is possible to infer pairwise constraints from domain knowledge or user feedback, in practice such constraints are scarce. Adjusting only the similarities between constrained data points does not yield much performance improvement.
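As an illustration of why pairwise constraints are weaker than labels, the sketch below derives must-link and cannot-link pairs from a labeling and stores them in a symmetric matrix. The +1/-1/0 encoding and the helper names `constraints_from_labels` and `constraint_matrix` are assumptions made for this example, not definitions taken from the paper.

```python
import numpy as np

def constraints_from_labels(labels, pairs):
    """Split index pairs into must-link and cannot-link lists."""
    must = [(i, j) for i, j in pairs if labels[i] == labels[j]]
    cannot = [(i, j) for i, j in pairs if labels[i] != labels[j]]
    return must, cannot

def constraint_matrix(n, must_link, cannot_link):
    """Symmetric n x n matrix: +1 must-link, -1 cannot-link, 0 unknown."""
    Z = np.zeros((n, n))
    for i, j in must_link:
        Z[i, j] = Z[j, i] = 1.0
    for i, j in cannot_link:
        Z[i, j] = Z[j, i] = -1.0
    return Z

pairs = [(0, 1), (0, 2), (2, 3)]

# Two different labelings of the same four points ...
must_a, cannot_a = constraints_from_labels([0, 0, 1, 2], pairs)
must_b, cannot_b = constraints_from_labels([5, 5, 7, 9], pairs)

# ... give identical constraints, so the labels cannot be recovered from them.
Z = constraint_matrix(4, must_a, cannot_a)
```

Labels always determine the constraints, but the reverse does not hold. Most entries of `Z` stay at zero, which reflects how scarce the constraints are and why they need to be propagated.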