Abstract

Network applications generate massive volumes of uncertain data with potential value, so clustering analysis of such data is of great significance. However, data uncertainty poses a serious challenge to traditional clustering algorithms, and existing clustering algorithms for uncertain data have two main problems. (1) Some algorithms perform many meaningless distance calculations between uncertain objects, which increases their computational complexity. (2) The data models used by some clustering algorithms lose data distribution information, which reduces their accuracy. In this paper, we propose an uncertain tuple density clustering (UTDC) algorithm for uncertain data. First, we introduce the cloud model to extract the data distribution features of uncertain instances and construct uncertain tuples, which prunes the set of clustering objects. Second, we apply the EW distance to the traditional density clustering algorithm DBSCAN to perform density clustering on uncertain data. Experimental results show that, compared with UK-Means and FDBSCAN, the UTDC algorithm effectively reduces computational complexity and improves accuracy.
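
The two steps described above can be illustrated with a minimal sketch; it is not the paper's implementation. It assumes one-dimensional uncertain objects, each given as a list of sampled instances, and summarizes each object with the standard backward cloud generator (expectation Ex, entropy En, hyper-entropy He). The abstract does not define the EW distance, so the hypothetical `placeholder_distance` below uses plain Euclidean distance between the (Ex, En, He) tuples as a stand-in. The function names, `eps`, `min_pts`, and the sample data are all illustrative assumptions.

```python
import math
from statistics import mean, variance


def backward_cloud(samples):
    """Summarize one uncertain object as an (Ex, En, He) tuple.

    Standard backward cloud generator without certainty degrees;
    whether UTDC uses exactly this form is an assumption.
    Needs at least two samples.
    """
    ex = mean(samples)
    en = math.sqrt(math.pi / 2) * mean(abs(x - ex) for x in samples)
    s2 = variance(samples)  # sample variance
    he = math.sqrt(abs(s2 - en * en))
    return (ex, en, he)


def placeholder_distance(a, b):
    """Stand-in for the EW distance (NOT the paper's definition)."""
    return math.dist(a, b)


def dbscan(items, eps, min_pts, dist):
    """Minimal DBSCAN with a pluggable distance; label -1 marks noise."""
    labels = [None] * len(items)

    def neighbors(i):
        # Includes i itself, as in the usual DBSCAN definition.
        return [j for j in range(len(items)) if dist(items[i], items[j]) <= eps]

    cluster = -1
    for i in range(len(items)):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = -1  # provisional noise; may become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            reachable = neighbors(j)
            if len(reachable) >= min_pts:
                queue.extend(reachable)  # j is a core point: keep expanding
    return labels


# Illustrative data: each inner list holds the instances of one uncertain object.
objects = [
    [1.0, 1.1, 0.9, 1.05],
    [1.2, 1.0, 1.1, 1.15],
    [0.95, 1.05, 1.0, 1.1],
    [9.0, 9.2, 8.8, 9.1],
    [9.1, 8.9, 9.0, 9.3],
    [5.0, 5.5, 4.5, 5.2],
]
tuples = [backward_cloud(obj) for obj in objects]
labels = dbscan(tuples, eps=0.5, min_pts=2, dist=placeholder_distance)
print(labels)
```

Clustering one tuple per object, rather than every instance, is what reduces the number of distance computations in this sketch: each pairwise comparison touches three numbers regardless of how many instances an object has.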
