Abstract

In networked environments, data across industries is dynamic and large in scale. Traditional clustering algorithms cannot effectively reuse existing clustering results when the data evolves continuously, which has drawn considerable attention to incremental grid-based clustering. However, existing incremental grid-based clustering algorithms do not adequately consider the impact of newly added data on the original cluster structure. To address this issue, a key-grid-based batch-incremental CLIQUE clustering algorithm is proposed. The algorithm designates as key grids those grids to which incremental data is mapped and which either contain original data themselves or have neighbouring grids that do, so that the changes in cluster structure caused by the incremental data are fully considered. Moreover, a cluster similarity coefficient based on grid features is introduced to measure density differences between the incremental data and the original clusters, and a cluster membership degree is defined to determine the cluster membership of data in sparse boundary grids and to identify noise points. Together, these mechanisms allow the algorithm to adaptively create, merge, or split clusters as new data arrives. Experimental results show that the proposed algorithm adaptively adjusts the cluster structure during incremental clustering and delivers superior accuracy and efficiency when clustering large-scale, dynamically changing data.
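
The key-grid idea can be illustrated with a minimal sketch. It is not the paper's implementation: the equal-width grid, the face-adjacent notion of a neighbour, and the names `cell_of`, `neighbours`, and `key_grids` are all assumptions introduced here for illustration, and the sketch covers only the selection of key grids, not the similarity coefficient or the membership degree.

```python
def cell_of(point, width):
    """Map a point to the integer index of its grid cell (assumed equal-width grid)."""
    return tuple(int(x // width) for x in point)


def neighbours(cell):
    """Yield the cells sharing a face with `cell` (assumed neighbour definition)."""
    for i in range(len(cell)):
        for step in (-1, 1):
            yield cell[:i] + (cell[i] + step,) + cell[i + 1:]


def key_grids(original_points, new_points, width):
    """Return cells hit by new data that hold, or border a cell holding, original data."""
    occupied = {cell_of(p, width) for p in original_points}  # cells with original data
    touched = {cell_of(p, width) for p in new_points}        # cells with incremental data
    return {
        cell
        for cell in touched
        if cell in occupied or any(n in occupied for n in neighbours(cell))
    }


# Hypothetical data: two original points and a batch of three new points.
original = [(0.5, 0.5), (1.5, 0.5)]
batch = [(0.7, 0.2), (2.5, 0.5), (9.0, 9.0)]
result = key_grids(original, batch, width=1.0)
```

In this example the cell holding `(9.0, 9.0)` is not a key grid, because neither it nor any of its neighbours contains original data; under the assumptions above, such a cell could only seed a new cluster or be treated as noise.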

