The K-means algorithm is one of the classical clustering algorithms. However, as the data set grows, the computational cost of clustering increases. The orthogonal matching pursuit (OMP) algorithm is a classic signal reconstruction algorithm. This paper improves the OMP algorithm on the basis of compressive learning and applies it to K-means, so that the cluster centers are estimated from a sketch of the original data set rather than from the data set itself. Experimental results show that the clustering quality of this method is similar to that of the K-means algorithm. Because the size of the sketch is independent of the size of the original data set and depends only on the number of centroids K and the data dimension n, the computational complexity of the algorithm is reduced. For large data sets, experiments show that the improved algorithm performs better than the traditional K-means algorithm.
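To illustrate why the sketch size does not grow with the data set, the following is a minimal sketch, not the paper's implementation. It assumes the sketch is an average of random Fourier features, a common choice in compressive learning that the abstract does not confirm. The names `draw_frequencies`, `sketch`, and `sketch_size`, the Gaussian frequency distribution, and the proportionality `factor` are all hypothetical.

```python
import cmath
import random


def sketch_size(K, n, factor=5):
    """Assumed sketch length: proportional to K * n (the factor is illustrative)."""
    return factor * K * n


def draw_frequencies(m, n, scale=1.0, seed=0):
    """Draw m random frequency vectors in R^n (Gaussian; an assumption)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, scale) for _ in range(n)] for _ in range(m)]


def sketch(points, freqs):
    """Average exp(-i * <w_j, x>) over all points x, for each frequency w_j.

    The result has len(freqs) complex entries regardless of how many
    points are passed in, and it is computed in a single pass.
    """
    acc = [0j] * len(freqs)
    count = 0
    for x in points:
        for j, w in enumerate(freqs):
            phase = sum(wk * xk for wk, xk in zip(w, x))
            acc[j] += cmath.exp(-1j * phase)
        count += 1
    return [a / count for a in acc]


if __name__ == "__main__":
    K, n = 2, 3
    freqs = draw_frequencies(sketch_size(K, n), n)
    rng = random.Random(1)
    small = [[rng.gauss(0, 1) for _ in range(n)] for _ in range(10)]
    large = [[rng.gauss(0, 1) for _ in range(n)] for _ in range(1000)]
    # Both sketches have the same length even though the data sets differ in size.
    print(len(sketch(small, freqs)), len(sketch(large, freqs)))
```

Under these assumptions, the centroid estimation step (the improved OMP in the paper) would operate only on the fixed-length sketch, so its cost no longer depends on the number of data points.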