Abstract
Using clustering methods to detect useful patterns in large datasets has attracted considerable interest recently. The hierarchical K-means (HKM) clustering algorithm is very efficient for large-scale data analysis and has been widely used to build visual vocabularies for large-scale video/image retrieval systems. However, both the speed and the accuracy of hierarchical K-means clustering leave room for improvement. In this paper, we propose a parallel N-path quantification hierarchical K-means clustering algorithm that improves on hierarchical K-means in three ways. First, we replace the Euclidean kernel with the Hellinger kernel to improve accuracy. Second, we adopt the Greedy N-best Paths Labeling method to further improve clustering accuracy. Third, we propose a parallel clustering algorithm for multi-core processors. Our results confirm that the proposed clustering algorithm is both much faster and more effective.
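The abstract does not spell out the algorithm, so the following is only a minimal sketch of two of the ideas it names, under stated assumptions. It assumes the Hellinger kernel is realized by L1-normalizing each non-negative descriptor and taking element-wise square roots, after which ordinary squared Euclidean distance corresponds to the Hellinger kernel. It also assumes that "N-best paths" means keeping the N closest candidate nodes at every level of the vocabulary tree instead of only one. The names `hellinger_map`, `Node`, and `n_best_quantize` and the toy centers are hypothetical and not taken from the paper; the multi-core parallel part is not shown.

```python
import math

def hellinger_map(vec):
    """L1-normalize a non-negative vector, then take element-wise square roots."""
    total = sum(vec)
    if total == 0:
        return [0.0] * len(vec)
    return [math.sqrt(v / total) for v in vec]

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

class Node:
    """Hypothetical vocabulary-tree node; centers live in the mapped space."""
    def __init__(self, center, children=None, label=None):
        self.center = center
        self.children = children or []
        self.label = label

def n_best_quantize(root, vec, n_paths):
    """Descend the tree, keeping the n_paths closest candidates per level."""
    x = hellinger_map(vec)
    frontier = [root]
    while any(node.children for node in frontier):
        candidates = []
        for node in frontier:
            # A leaf reached early stays in the candidate pool.
            candidates.extend(node.children if node.children else [node])
        candidates.sort(key=lambda c: sq_dist(x, c.center))
        frontier = candidates[:n_paths]
    best = min(frontier, key=lambda c: sq_dist(x, c.center))
    return best.label

# Toy two-level tree with hand-picked (made-up) centers.
branch_a = Node([0.8, 0.5], children=[
    Node([0.95, 0.3], label="a1"),
    Node([0.9, 0.45], label="a2"),
])
branch_b = Node([0.5, 0.9], children=[
    Node([0.7, 0.72], label="b1"),
    Node([0.3, 0.95], label="b2"),
])
root = Node([0.0, 0.0], children=[branch_a, branch_b])

single_path = n_best_quantize(root, [1, 1], n_paths=1)  # plain greedy descent
two_paths = n_best_quantize(root, [1, 1], n_paths=2)    # keeps both branches alive
print(single_path, two_paths)
```

In this toy tree the single-path descent commits to the closer top-level branch and ends at leaf `a2`, while keeping two paths also explores the other branch and reaches the much closer leaf `b1`. This illustrates why following several paths can reduce quantization error, at the cost of more distance computations per level.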
More From: International Journal of Pattern Recognition and Artificial Intelligence