An improved parallel K-means algorithm based on MapReduce

Jianmin Xu, Yanfang Shou, Dongbo Zhang

doi:10.1504/ijes.2017.10005724

Journal: International Journal of Embedded Systems
Publication Date: Jan 1, 2017
Citations: 4

Affiliation: South China University of Technology

#Canopy Algorithm #K-means Algorithm

Abstract

The K-means algorithm is one of the most popular clustering algorithms. However, it is sensitive to the initial partition and to circular datasets. To address these problems, this paper introduces CK-means, a clustering algorithm that combines the K-means algorithm with the Canopy algorithm and is implemented with the MapReduce programming model on the Hadoop platform. Theoretical analysis shows that the complexity of the CK-means algorithm is of the same order of magnitude as that of the traditional algorithm. Experimental results on artificial data show that the improved algorithm outperforms the traditional algorithm in terms of speedup, accuracy and scaleup, and indicate that CK-means has strong advantages for processing large datasets. An experiment on real data is performed to obtain appropriate parameters.
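
The abstract does not give the algorithm's details, so the following is only a minimal single-machine sketch of the general idea: Canopy selects the initial centers, and each K-means iteration is then written as a map phase (assign every point to its nearest center) followed by a reduce phase (average the points per center). The function names, the single tight threshold `t2`, and the use of Euclidean distance are assumptions made for illustration; they are not taken from the paper, and no Hadoop API is involved.

```python
import math
from collections import defaultdict


def canopy_centers(points, t2):
    """Pick initial centers with a simplified Canopy pass.

    Assumption: only the tight threshold T2 is used, because this sketch
    needs the canopy centers and not the (overlapping) canopy membership
    that the loose threshold T1 would define.
    """
    remaining = list(points)
    centers = []
    while remaining:
        center = remaining[0]
        centers.append(center)
        # Drop every point closer than T2 to the new center.
        remaining = [p for p in remaining[1:] if math.dist(p, center) > t2]
    return centers


def kmeans_step(points, centers):
    """One K-means iteration expressed as map then reduce."""
    groups = defaultdict(list)
    for p in points:
        # Map: emit (index of nearest center, point).
        key = min(range(len(centers)), key=lambda i: math.dist(p, centers[i]))
        groups[key].append(p)
    new_centers = list(centers)
    for key, members in groups.items():
        # Reduce: the new center is the coordinate-wise mean of its points.
        new_centers[key] = tuple(sum(c) / len(members) for c in zip(*members))
    return new_centers


def ck_means(points, t2, max_iter=20):
    """Canopy seeding followed by K-means iterations until the centers stop moving."""
    centers = canopy_centers(points, t2)
    for _ in range(max_iter):
        updated = kmeans_step(points, centers)
        if updated == centers:
            break
        centers = updated
    return centers


if __name__ == "__main__":
    data = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
    print(ck_means(data, t2=3.0))
```

In this sketch the number of clusters is not supplied by the caller; it falls out of the Canopy pass, which is one reason such seeding is used to reduce sensitivity to the initial partition. In a real MapReduce job the map and reduce phases would run as distributed tasks over data splits rather than as loops in one process.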
