An Improved K-Means Algorithm Based on Evidence Distance

Ailin Zhu, Zexi Hua, Lingwei Miao, Yu Shi, Yongchuan Tang

doi:10.3390/e23111550

Abstract

The clustering quality of the k-means algorithm depends mainly on the selection of the initial cluster centers and on the distance measure between sample points. The traditional k-means algorithm uses Euclidean distance to measure the distance between sample points; it therefore differentiates poorly between the attributes of sample points and is prone to locally optimal solutions. To address this problem, this paper proposes an improved k-means algorithm based on evidence distance. First, the attribute values of sample points are modelled as the basic probability assignment (BPA) of sample points. Then, the traditional Euclidean distance is replaced by the evidence distance for measuring the distance between sample points. Finally, k-means clustering is carried out using UCI data. Experimental comparisons are made with the traditional k-means algorithm, the k-means algorithm based on the aggregation distance parameter, and the Gaussian mixture model. The experimental results show that the proposed algorithm achieves a better clustering effect and better convergence.

Highlights

With the rapid development of technologies such as cloud computing and the internet of things [1,2], the number of connected devices is increasing and the data generated during human–computer interaction and system operation is growing exponentially [3,4,5].
Clustering is a method of data mining [9].
In order to explore whether a new distance measure can be obtained by using evidence distance instead of Euclidean distance, an improved k-means algorithm based on evidence distance is proposed in this paper.

Summary

Introduction

With the rapid development of technologies such as cloud computing and the internet of things [1,2], the number of connected devices is increasing and the data generated during human–computer interaction and system operation is growing exponentially [3,4,5]. Tang et al. [24] proposed the d-k-means algorithm, which builds on the traditional algorithm by weighing the influence of density and distance on clustering and weighting the data. In order to explore whether a new distance measure can be obtained by using evidence distance instead of Euclidean distance, an improved k-means algorithm based on evidence distance is proposed in this paper. Validation on the UCI data set and a toy data set, together with experimental comparison against the traditional k-means algorithm, the k-means algorithm based on the aggregation distance parameter, and the Gaussian mixture model, shows that the improved k-means algorithm has a better clustering effect and better convergence. The third section introduces the algorithmic ideas and motivation of this paper and proposes the improved k-means algorithm based on evidence distance.
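The idea of swapping Euclidean distance for an evidence distance can be illustrated with a minimal Python sketch. It rests on assumptions that this page does not confirm: the evidence distance is taken to be Jousselme's distance, a common choice in D-S evidence theory, and each sample is assumed to be already expressed as a BPA over all non-empty subsets of a small frame of discernment. The function names (`focal_sets`, `jaccard_matrix`, `evidence_distance`, `assign_to_centres`) are hypothetical and are not taken from the paper.

```python
from itertools import combinations

import numpy as np


def focal_sets(n):
    """All non-empty subsets of a frame with n hypotheses, ordered by size."""
    return [frozenset(c) for r in range(1, n + 1)
            for c in combinations(range(n), r)]


def jaccard_matrix(sets):
    """D[i, j] = |A_i ∩ A_j| / |A_i ∪ A_j| for every pair of focal sets."""
    k = len(sets)
    D = np.empty((k, k))
    for i, a in enumerate(sets):
        for j, b in enumerate(sets):
            D[i, j] = len(a & b) / len(a | b)
    return D


def evidence_distance(m1, m2, D):
    """Jousselme distance (assumed here): sqrt(0.5 * (m1 - m2)^T D (m1 - m2))."""
    diff = np.asarray(m1, dtype=float) - np.asarray(m2, dtype=float)
    # Clamp at zero to guard against tiny negative values from rounding.
    return float(np.sqrt(max(0.5 * diff @ D @ diff, 0.0)))


def assign_to_centres(bpas, centres, D):
    """k-means assignment step using the evidence distance instead of Euclidean."""
    return [int(np.argmin([evidence_distance(m, c, D) for c in centres]))
            for m in bpas]


# Frame with two hypotheses: focal sets are {0}, {1}, {0, 1}.
sets = focal_sets(2)
D = jaccard_matrix(sets)
samples = [[0.9, 0.1, 0.0], [0.2, 0.8, 0.0]]   # illustrative BPAs, not real data
centres = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
labels = assign_to_centres(samples, centres, D)
```

In this sketch only the assignment step changes; the centre update and the stopping rule of k-means are left out. How the paper actually maps attribute values to BPAs and updates the centres is not stated on this page, so those parts are deliberately omitted.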

D-S Evidence Theory

Algorithm Flow

Experimental Evaluation Indicators

Experimental Procedure

Iris Data Set Test Results


Journal: Entropy	Publication Date: Nov 21, 2021
Citations: 9	License type: CC BY 4.0
