Abstract

Clustering algorithms are a main research area in collaborative computing for social networks. How to evaluate clustering results accurately has become a hot topic in clustering research. The commonly used evaluation indexes are the Silhouette Coefficient (SC), the Davies-Bouldin Index (DBI) and the Calinski-Harabasz Index (CHI). There are two shortcomings in the calculation of three in

Highlights

  • Clustering is an important technique for data mining in collaborative computing for social networks

  • There are three main indexes for evaluating unlabeled clustering results: the Calinski-Harabasz Index [1], the Davies-Bouldin Index [2] and the Silhouette Coefficient [3]. Each defines its own way of calculating intra-cluster relations and inter-cluster relations, and evaluates the clustering results by combining the two

  • To solve the above two problems and evaluate clustering results better, this paper proposes new indexes based on the calculation process of the three existing ones: NSC (New SC), NDBI (New DBI) and NCHI (New CHI)
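For reference, the three standard indexes named in the highlights can be sketched in plain Python. This is a minimal sketch of the usual textbook definitions with Euclidean distance, not the NSC, NDBI or NCHI variants that the paper proposes. The function names and the toy data are assumptions made for illustration, and the code assumes at least two clusters with distinct centroids.

```python
import math


def centroid(points):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(points)
    return [sum(col) / n for col in zip(*points)]


def group(X, labels):
    """Map each cluster label to the list of points carrying it."""
    clusters = {}
    for x, lab in zip(X, labels):
        clusters.setdefault(lab, []).append(x)
    return clusters


def calinski_harabasz(X, labels):
    """Between-cluster dispersion over within-cluster dispersion (higher is better)."""
    clusters = group(X, labels)
    n, k = len(X), len(clusters)
    overall = centroid(X)
    between = sum(len(pts) * math.dist(centroid(pts), overall) ** 2
                  for pts in clusters.values())
    within = sum(math.dist(p, centroid(pts)) ** 2
                 for pts in clusters.values() for p in pts)
    return (between / (k - 1)) / (within / (n - k))


def davies_bouldin(X, labels):
    """Mean worst-case ratio of cluster scatter to centroid separation (lower is better)."""
    clusters = list(group(X, labels).values())
    cents = [centroid(pts) for pts in clusters]
    scatter = [sum(math.dist(p, c) for p in pts) / len(pts)
               for pts, c in zip(clusters, cents)]
    k = len(clusters)
    worst = [max((scatter[i] + scatter[j]) / math.dist(cents[i], cents[j])
                 for j in range(k) if j != i)
             for i in range(k)]
    return sum(worst) / k


def silhouette(X, labels):
    """Mean of (b - a) / max(a, b) over all points (closer to 1 is better)."""
    clusters = group(X, labels)
    total = 0.0
    for x, lab in zip(X, labels):
        own = clusters[lab]
        if len(own) == 1:
            continue  # common convention: a singleton cluster contributes 0
        # a: mean distance to the other members of the same cluster
        a = sum(math.dist(x, p) for p in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster
        b = min(sum(math.dist(x, p) for p in pts) / len(pts)
                for other, pts in clusters.items() if other != lab)
        total += (b - a) / max(a, b)
    return total / len(X)


if __name__ == "__main__":
    # Toy data (hypothetical): two tight, well-separated clusters.
    X = [(0, 0), (0, 1), (10, 0), (10, 1)]
    labels = [0, 0, 1, 1]
    print("CHI:", calinski_harabasz(X, labels))
    print("DBI:", davies_bouldin(X, labels))
    print("SC: ", silhouette(X, labels))
```

On this toy data the clusters are compact and far apart, so CHI is large, DBI is small and SC is close to 1. All three combine an intra-cluster term with an inter-cluster term, which is the structure the proposed indexes build on.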

Summary

Introduction

Clustering is an important technique for data mining in collaborative computing for social networks. The proposed indexes address two problems to some extent. First, when the feature vectors change, the calculated index values no longer change greatly, because the calculated value of the relationship between feature vectors varies only within the interval [0, 1]. Second, when the number of clusters and the elements of each vector are kept unchanged and only the objects in the clusters change, the calculated index values no longer change too little. This is achieved by using the number of elements in each cluster as a coefficient when calculating the relationship between feature vectors, which enlarges the calculated values of both the inter-cluster and intra-cluster relations, although the two values increase by different amounts.

Related Works
Definition of Relevant Symbols
Calinski-Harabasz Index
Davies-Bouldin Index
Silhouette Coefficient
Problem Description
Solutions to Problems
Redefinition of Calinski-Harabasz Index
Redefinition of Davies-Bouldin Index
Redefinition of Silhouette Coefficient
Data Set and Evaluation Standard
An Example of Calculating Indexes
Clustering Results Testing-Changing Feature Vectors
Clustering Results Testing-Changing Objects in Clusters
Conclusion
