Abstract

Spectral Clustering (SC) has been the subject of intensive research due to its remarkable clustering performance. Despite its successes, most existing SC methods suffer from several critical issues. Firstly, they typically involve two independent stages: learning the continuous relaxation matrix, followed by discretization into the cluster indicator matrix. This two-stage approach can yield suboptimal solutions that degrade clustering performance. Secondly, these methods struggle to preserve the balance property of clusters that is inherent in many real-world datasets, which restricts their practical applicability. Finally, these methods are computationally expensive and hence unable to handle large-scale datasets. In light of these limitations, we present a novel Discrete and Balanced Spectral Clustering with Scalability (DBSC) model that integrates the learning of the continuous relaxation matrix and the discrete cluster indicator matrix into a single step. Moreover, the proposed model keeps the size of each cluster approximately equal, thereby achieving soft-balanced clustering. In addition, the DBSC model incorporates an anchor-based strategy to improve its scalability to large-scale datasets. The experimental results demonstrate that our proposed model outperforms existing methods in terms of both clustering performance and balance performance. Specifically, the clustering accuracy of DBSC on the CMUPIE dataset is 17.93% higher than that of the state-of-the-art (SOTA) methods (LABIN, EBSC, etc.).
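
To make the notion of "balance performance" concrete, the sketch below scores how evenly a set of labels is spread across clusters. The abstract does not say which balance metric the authors use, so the normalized entropy of cluster sizes is an assumption here, and the function names and toy label lists are hypothetical. The score is 1 when all clusters have the same size and falls toward 0 as points concentrate in a single cluster.

```python
import math
from collections import Counter


def cluster_sizes(labels, n_clusters):
    """Count how many points fall in each cluster 0..n_clusters-1."""
    counts = Counter(labels)
    return [counts.get(k, 0) for k in range(n_clusters)]


def normalized_entropy(labels, n_clusters):
    """Balance score in [0, 1] (assumed metric, not taken from the paper).

    Computes the entropy of the cluster-size distribution and divides it
    by log(n_clusters), the entropy of a perfectly even partition.
    """
    n = len(labels)
    if n == 0 or n_clusters < 2:
        raise ValueError("need at least one point and two clusters")
    entropy = 0.0
    for size in cluster_sizes(labels, n_clusters):
        if size > 0:  # empty clusters contribute nothing to the entropy
            p = size / n
            entropy -= p * math.log(p)
    return entropy / math.log(n_clusters)


# Hypothetical label assignments for six points and three clusters.
balanced = [0, 0, 1, 1, 2, 2]
skewed = [0, 0, 0, 0, 1, 2]

balanced_score = normalized_entropy(balanced, 3)
skewed_score = normalized_entropy(skewed, 3)
```

Under this assumed metric, a soft-balanced method would be expected to score close to 1 without necessarily reaching it, since cluster sizes are only approximately equal.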
