RETRACTED ARTICLE: Innovative study on clustering center and distance measurement of K-means algorithm: mapreduce efficient parallel algorithm based on user data of JD mall

Yang Liu, Xinxin Du, Shuaifeng Ma

doi:10.1007/s10660-021-09458-z

Abstract

The traditional K-means algorithm is very sensitive to the selection of the initial clustering centers and to the way distances are calculated, so it easily converges to a locally optimal solution. It also converges slowly, clusters with low accuracy, and runs into memory bottlenecks when processing massive data. This paper therefore proposes an improved K-means algorithm. First, the selection of the initial points used by the traditional algorithm is improved. Second, a new global measure, the effective distance measure, is proposed; its main idea is to calculate the effective distance between two data samples by sparse reconstruction. Finally, the algorithm is implemented on the MapReduce framework, and its efficiency is further improved by tuning the Hadoop cluster. Using real customer data from the JD Mall dataset, the paper evaluates the clustering quality of several algorithms with the Davies-Bouldin index (DBI), the Rand index, and other indicators. The results show that the proposed algorithm converges well, is accurate, and outperforms the compared algorithms.
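To make the MapReduce formulation mentioned in the abstract more concrete, the following is a minimal sketch of how one K-means iteration can be split into a map phase (assign each point to its nearest center) and a reduce phase (average the points of each cluster). It is an illustration under stated assumptions, not the paper's implementation: it uses plain Euclidean distance rather than the proposed effective distance, it takes the initial centers as given instead of applying the improved initialization, and it simulates both phases in a single Python process rather than on Hadoop. The function names `map_phase`, `reduce_phase`, and `kmeans` are hypothetical.

```python
from collections import defaultdict
from math import dist  # Euclidean distance, Python 3.8+


def map_phase(points, centers):
    """Map: emit (index of nearest center, point) for every point."""
    for p in points:
        # Assumption: Euclidean distance stands in for the paper's effective distance.
        nearest = min(range(len(centers)), key=lambda i: dist(p, centers[i]))
        yield nearest, p


def reduce_phase(pairs, centers):
    """Reduce: group points by center index and average each group."""
    groups = defaultdict(list)
    for idx, p in pairs:
        groups[idx].append(p)
    new_centers = list(centers)  # a center with no points keeps its old position
    for idx, members in groups.items():
        new_centers[idx] = tuple(sum(c) / len(members) for c in zip(*members))
    return new_centers


def kmeans(points, centers, max_iter=100, tol=1e-9):
    """Repeat map and reduce until no center moves by more than tol."""
    for _ in range(max_iter):
        updated = reduce_phase(map_phase(points, centers), centers)
        shift = max(dist(a, b) for a, b in zip(centers, updated))
        centers = updated
        if shift <= tol:
            break
    return centers


if __name__ == "__main__":
    data = [(0, 0), (0, 2), (10, 0), (10, 2)]
    print(kmeans(data, [(0, 0), (10, 0)]))
```

On a real cluster the map phase would run in parallel over data splits and the shuffle step would route each center index to one reducer; here the `defaultdict` grouping plays the role of the shuffle.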


Journal: Electronic Commerce Research
Publication Date: Mar 31, 2021
Citations: 5

