Abstract

Scalability is a long-standing challenge in the study of clustering techniques and is of fundamental interest to researchers in data mining and knowledge discovery. Compared with other clustering methods, hierarchical clustering yields more interpretable results but scales poorly to large datasets, so the problem warrants more comprehensive study. This paper develops a new scalable hierarchical clustering model called Election Tree, which detects the most representative point of each sub-cluster through a node election process on split data and adjusts sub-cluster membership through node merging and swap operations. Extensive experiments on real-world datasets show that the proposed framework achieves higher clustering accuracy than the competing baseline methods. In scalability tests on incremental synthetic datasets, the new model also takes significantly less time than state-of-the-art hierarchical clustering models such as PERCH, GRINCH, and SCC, as well as other classic baselines.
