As big data continues to grow, cluster analysis remains an important tool. Among clustering methods, the K-means algorithm is the most widely used, but its random selection of initial cluster centroids can produce unstable clustering results. To address this, this paper proposes an improved honey badger optimization algorithm with four modifications: (1) The population is initialized using sine chaos so that it is more uniformly distributed. (2) The density factor is improved to enhance the optimization accuracy of the population. (3) A nonlinear inertia weight factor is introduced to prevent honey badger individuals from relying too heavily on the behavior of past individuals during position updating. (4) Random opposition learning is performed on the optimal individuals to improve the diversity of solutions. In experiments on 23 benchmark test functions, the improved algorithm outperforms the comparison algorithm. Finally, the improved algorithm is applied to K-means clustering, and experiments are conducted on three data sets from the UCI repository. The results show that K-means optimized by the improved honey badger algorithm achieves a better clustering effect than the traditional K-means algorithm.
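The abstract does not give formulas, so the following is only a minimal sketch of how two of the listed ingredients could look when a metaheuristic searches for K-means centroids. It assumes the sine map x_{k+1} = sin(pi * x_k) for the chaotic initialization and the common form lb + ub - r * x for random opposition learning; the paper's exact definitions may differ. The names `sine_chaos_init`, `random_opposition`, and `kmeans_sse` are hypothetical, and the honey badger position update itself (density factor, inertia weight) is not shown.

```python
import numpy as np

def sine_chaos_init(pop_size, dim, lb, ub, seed=0):
    """Initialize a population with a sine chaotic map.

    Assumed map: x_{k+1} = sin(pi * x_k), which stays in [0, 1];
    each iterate is scaled into the search bounds [lb, ub].
    """
    rng = np.random.default_rng(seed)
    x = rng.uniform(0.1, 0.9, size=dim)  # random starting values in (0, 1)
    pop = np.empty((pop_size, dim))
    for i in range(pop_size):
        x = np.sin(np.pi * x)            # one chaotic iteration per individual
        pop[i] = lb + x * (ub - lb)
    return pop

def random_opposition(best, lb, ub, rng):
    """Random opposition candidate (assumed form): lb + ub - r * best, clipped."""
    r = rng.random(best.shape)
    return np.clip(lb + ub - r * best, lb, ub)

def kmeans_sse(position, data, k):
    """Fitness: sum of squared distances from each point to its nearest centroid.

    A position vector of length k * n_features is decoded into k centroids.
    """
    centroids = position.reshape(k, data.shape[1])
    d2 = ((data[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return d2.min(axis=1).sum()

# Usage on synthetic data in the unit square (not a UCI data set).
rng = np.random.default_rng(1)
data = rng.random((60, 2))
k = 3
dim = k * data.shape[1]

pop = sine_chaos_init(20, dim, lb=0.0, ub=1.0)
fitness = np.array([kmeans_sse(p, data, k) for p in pop])
best = pop[fitness.argmin()].copy()
best_fit = fitness.min()

# Greedy selection: keep the opposite candidate only if it lowers the SSE.
cand = random_opposition(best, 0.0, 1.0, rng)
cand_fit = kmeans_sse(cand, data, k)
if cand_fit < best_fit:
    best, best_fit = cand, cand_fit
```

In a full optimizer, `best` would be refined over many iterations, and the decoded centroids could then serve as the starting centers for a standard K-means run in place of a random choice.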