Abstract

The K-means clustering algorithm is an unsupervised learning method that is simple in principle, easy to implement, and highly adaptable. It has three well-known disadvantages: the number of clusters is difficult to determine, the algorithm is sensitive to the initial cluster centers, and the clustering result is easily affected by outliers. To address them, this paper proposes an improved clustering algorithm based on density selection. The algorithm compares the neighborhood density of each sample with the average density of all samples, treats the samples with lower density as outliers or isolated points, and deletes them. After this data pre-processing, a modified cluster validity index is minimized to obtain the optimal number of clusters, and the initial cluster centers are then optimized by a density selection strategy. Experiments verify that the improved algorithm is more accurate than the traditional K-means algorithm and converges faster to the global minimum of the sum of squared errors (SSE).
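
The abstract does not define the neighborhood density or the density selection strategy in detail, so the following is only a minimal sketch of the two density-based steps under stated assumptions, not the paper's implementation. It assumes that a sample's density is the number of other samples within a fixed radius, and that initial centers are chosen by a hypothetical score that favors dense samples far from the centers already picked. The function names, the `radius` parameter, and the score are illustrative and do not come from the paper.

```python
import numpy as np


def neighborhood_density(X, radius):
    """Count, for each sample, the other samples within `radius`.

    Assumption: the abstract does not give the density formula; a
    fixed-radius neighbor count is used here for illustration.
    """
    diffs = X[:, None, :] - X[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=2))
    # Subtract 1 so that a sample does not count itself.
    return (dists <= radius).sum(axis=1) - 1


def remove_low_density(X, radius):
    """Drop samples whose density is below the average density of all samples."""
    density = neighborhood_density(X, radius)
    keep = density >= density.mean()
    return X[keep], keep


def density_seeds(X, k, radius):
    """Pick k initial centers using density (hypothetical selection rule)."""
    density = neighborhood_density(X, radius).astype(float)
    # Start from the densest sample.
    seeds = [int(np.argmax(density))]
    while len(seeds) < k:
        # Distance from every sample to its nearest already-chosen seed.
        to_seeds = np.linalg.norm(X[:, None, :] - X[seeds][None, :, :], axis=2)
        nearest = to_seeds.min(axis=1)
        # Hypothetical score: favor samples that are dense and far from existing seeds.
        seeds.append(int(np.argmax((density + 1.0) * nearest)))
    return X[seeds]


# Toy data: two tight groups of four points and one isolated point.
X = np.array([
    [0.0, 0.0], [0.0, 0.1], [0.1, 0.0], [0.1, 0.1],
    [5.0, 5.0], [5.0, 5.1], [5.1, 5.0], [5.1, 5.1],
    [20.0, 20.0],
])
X_clean, keep = remove_low_density(X, radius=0.5)
seeds = density_seeds(X_clean, k=2, radius=0.5)
```

On this toy data the isolated point has no neighbors, falls below the average density, and is removed, while the two seeds land in different groups. The seeds could then be passed to any standard K-means routine as its initial centers. The choice of `radius` strongly affects which samples count as outliers, and the abstract does not say how the paper sets it.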
