Abstract

The K-means clustering algorithm is an important unsupervised learning method and plays a significant role in big data processing, computer vision, and other research fields. However, because of its sensitivity to the initial partition, outliers, noise, and other factors, its clustering results in data analysis, image segmentation, and related applications are unstable and lack robustness. Building on the fast global K-means clustering algorithm, this paper proposes an improved K-means clustering algorithm. Through a neighborhood filtering mechanism, points within the neighborhood of an already selected initial cluster center are excluded from the selection of the next initial cluster center, which effectively reduces the randomness of the initial partition and improves its efficiency. Mahalanobis distance is used during clustering to better account for the global structure of the data. Compared with the traditional clustering algorithm and other optimized variants, the results on real data sets are significantly improved.
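The neighborhood filtering idea described above can be sketched as follows. This is an illustrative interpretation, not the paper's exact procedure: the `radius` parameter, the mean-based choice of the first center, and the farthest-point rule for subsequent centers are assumptions made for the sketch.

```python
import numpy as np

def select_initial_centers(X, k, radius):
    """Pick k initial cluster centers; once a center is chosen, points
    within `radius` of it are excluded from later center selection.
    Illustrative sketch: `radius` is a hypothetical tuning parameter."""
    eligible = np.ones(len(X), dtype=bool)  # points still allowed as centers
    centers = []
    # First center: the point closest to the overall mean (one common heuristic).
    first = np.argmin(np.linalg.norm(X - X.mean(axis=0), axis=1))
    centers.append(X[first])
    eligible[np.linalg.norm(X - X[first], axis=1) < radius] = False
    while len(centers) < k and eligible.any():
        # Next center: eligible point farthest from its nearest chosen center.
        dists = np.min([np.linalg.norm(X - c, axis=1) for c in centers], axis=0)
        dists[~eligible] = -np.inf  # filtered-out points cannot be selected
        nxt = np.argmax(dists)
        centers.append(X[nxt])
        eligible[np.linalg.norm(X - X[nxt], axis=1) < radius] = False
    return np.array(centers)
```

Because each selected center suppresses its neighborhood, the chosen initial centers are guaranteed to be mutually farther apart than `radius`, which is what reduces the randomness of the initial partition.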

Highlights

  • With the development of artificial intelligence, researchers have explored more and more application scenarios for intelligent algorithms [1], and various machine learning algorithms have become research hotspots

  • Building on the fast global K-means clustering algorithm, this paper proposes an improved K-means clustering algorithm

  • Clustering experiments were carried out on the traditional K-means algorithm, the fast global K-means algorithm (FGK-means), the fast global K-means algorithm based on neighborhood screening (RFGK-means), and the fast global K-means algorithm based on neighborhood screening and Mahalanobis distance (RMFGK-means), respectively


Summary

Introduction

With the development of artificial intelligence, researchers have explored more and more application scenarios for intelligent algorithms [1], and various machine learning algorithms have become research hotspots. In the traditional K-means algorithm, the number of cluster centers is chosen empirically by inspecting the data, and the initial locations of the cluster centers are random. This makes the algorithm unstable and susceptible to noise and outliers. Paper [5] used residual analysis to obtain the initial cluster centers and the number of clusters automatically from a decision graph, which solves the problem of manually specifying the number of clusters; however, that method is complex to implement and performs poorly on sparsely distributed data sets. Mahalanobis distance [13] is used in the clustering process, which improves the global awareness of the clustering and makes the algorithm better suited to applications such as image processing.
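The Mahalanobis distance mentioned above can be written compactly. A minimal sketch, assuming a sample covariance estimated from the data (the function name and arguments are illustrative, not from the paper):

```python
import numpy as np

def mahalanobis(x, mean, cov):
    """Mahalanobis distance of point x from a distribution with the given
    mean and covariance. Unlike Euclidean distance, it accounts for feature
    scales and correlations, giving the 'global' view used in clustering."""
    diff = x - mean
    return float(np.sqrt(diff @ np.linalg.inv(cov) @ diff))
```

With an identity covariance it reduces to the ordinary Euclidean distance; with a non-identity covariance, displacement along a high-variance (or correlated) direction counts for less, which is why it reflects the global shape of a cluster better than Euclidean distance.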

Traditional K-Means Clustering
Fast Global K-Means
Global K-Means Algorithm
Initial Value Filtering Optimizes Fast Global K-Means
Neighborhood Filter
Mahalanobis Distance
Average Error
Method
Experiment and Results
Simulation Result
Experimental Analysis
Conclusion
