Clustering is a core part of machine learning and has received sustained attention. Most current clustering methods partition data objects using the Euclidean metric, which is simple and effective. However, as data become higher-dimensional and their representations more diverse, the spatial structure of real-world data grows increasingly complex. Classical clustering methods then face several challenges: insufficient clustering effectiveness, sensitivity to parameters, and unstable results. To address these problems, this paper designs a non-Euclidean metric and builds a multi-granularity staged clustering method on it. First, the paper uses the ordering relationship on each feature of the data to construct a similarity measure between objects from the perspective of positive and negative granularity, improving the clustering algorithm's ability to capture data with complex spatial structure. Second, it designs an attenuation-diffusion pattern that applies a divide-and-conquer strategy according to the distribution characteristics of data objects in different patterns, and uses a heuristic to cluster the data in stages, from local to global. Third, building on these two components, it proposes a clustering method based on multi-positive-negative granularity and the attenuation-diffusion pattern, which can effectively handle the challenges that complex spatial structure poses to clustering. Finally, the effectiveness and robustness of the proposed method are compared with those of advanced clustering methods on real-world UCI data sets. Experimental results show that the proposed method has clear advantages in clustering results on data with complex spatial structure.
In addition, for both non-Euclidean metrics and multi-granularity clustering, the proposed method offers a new perspective on designing clustering methods for data with complex spatial structure.