Abstract

The k-means algorithm is simple to implement and fast, and it is one of the most widely used clustering algorithms. To address the noise sensitivity of k-means on high-dimensional sparse data sets, the IB k-means (interpolation-based k-means clustering) algorithm is proposed. Building on k-means, it uses a genetic algorithm for interpolation, which addresses the tendency of sparse data points to be merged during k-means clustering. Experimental results show that, compared with several improved k-means-based clustering methods, the proposed method achieves better clustering results and handles high-dimensional sparse data more effectively.
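
For orientation, the baseline that IB k-means builds on can be sketched as follows. This is a minimal illustration of standard k-means (Lloyd's iteration) only; it does not include the genetic-algorithm interpolation step, whose details are not given in this abstract. The names `kmeans` and `squared_distance`, the random initialization, and the iteration cap are assumptions made for this sketch.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iterations=100, seed=0):
    """Plain k-means (Lloyd's iteration); NOT the IB k-means method.

    points: list of equal-length numeric tuples
    k: number of clusters (must not exceed len(points))
    Returns (labels, centroids).
    """
    rng = random.Random(seed)
    # Assumption: initialize centroids from k randomly chosen data points.
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: squared_distance(p, centroids[j]))
            for p in points
        ]
        # Stop once the assignment no longer changes.
        if new_labels == labels:
            break
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = [sum(c) / len(members) for c in zip(*members)]
    return labels, centroids


if __name__ == "__main__":
    data = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
    labels, centroids = kmeans(data, k=2)
    print(labels, centroids)
```

Because each centroid is a plain mean of its members, a few noisy or isolated points can shift it noticeably, which is the sensitivity the abstract refers to.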
