Abstract
Spectral Clustering (SC) has been the subject of intensive research due to its remarkable clustering performance. Despite its successes, most existing SC methods suffer from several critical issues. Firstly, they typically involve two independent stages, i.e., learning the continuous relaxation matrix followed by the discretization of the cluster indicator matrix. This two-stage approach can result in suboptimal solutions that negatively impact the clustering performance. Secondly, these methods struggle to maintain the balance property of clusters inherent in many real-world datasets, which restricts their practical applicability. Finally, these methods are computationally expensive and hence unable to handle large-scale datasets. In light of these limitations, we present a novel Discrete and Balanced Spectral Clustering with Scalability (DBSC) model that integrates the learning of the continuous relaxation matrix and the discrete cluster indicator matrix into a single step. Moreover, the proposed model keeps the sizes of the clusters approximately equal, thereby achieving soft-balanced clustering. Furthermore, the DBSC model incorporates an anchor-based strategy to improve its scalability to large-scale datasets. The experimental results demonstrate that our proposed model outperforms existing methods in terms of both clustering performance and balance performance. Specifically, on the CMUPIE dataset, DBSC improves clustering accuracy by 17.93% over state-of-the-art methods (LABIN, EBSC, etc.).
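For orientation, the conventional two-stage pipeline that the abstract criticizes can be sketched as follows. This is not the DBSC model: it is a minimal illustration of the baseline (continuous relaxation via eigenvectors, then a separate discretization by k-means), and the function name, the Gaussian affinity with bandwidth `sigma`, and the farthest-first initialization are all assumptions made for this example.

```python
import numpy as np


def two_stage_spectral_clustering(X, k, sigma=1.0, n_iter=50):
    """Illustrative two-stage SC baseline (hypothetical helper, not DBSC)."""
    # Gaussian affinity graph over all sample pairs (assumed graph choice).
    sq_dists = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
    W = np.exp(-sq_dists / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)

    # Symmetric normalized Laplacian L = I - D^{-1/2} W D^{-1/2}.
    d = W.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(np.maximum(d, 1e-12))
    L = np.eye(len(X)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]

    # Stage 1: continuous relaxation, i.e. eigenvectors of the k smallest
    # eigenvalues (np.linalg.eigh returns eigenvalues in ascending order).
    _, eigvecs = np.linalg.eigh(L)
    F = eigvecs[:, :k]
    F = F / np.maximum(np.linalg.norm(F, axis=1, keepdims=True), 1e-12)

    # Stage 2: discretization by k-means on the rows of F.
    # Farthest-first initialization keeps the sketch deterministic.
    centers = [F[0]]
    for _ in range(1, k):
        d2 = np.min([((F - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(F[d2.argmax()])
    centers = np.array(centers)

    for _ in range(n_iter):
        dists = ((F[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
        labels = dists.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = F[labels == j].mean(axis=0)
    return labels
```

Because stage 2 never feeds back into stage 1, the discrete labels are only an approximation of the relaxed solution, and nothing in either stage constrains cluster sizes. The dense n-by-n affinity matrix and full eigendecomposition also illustrate why this baseline scales poorly to large datasets.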
More From: IEEE Transactions on Pattern Analysis and Machine Intelligence