Abstract

Clustering is an unsupervised process of grouping unlabeled data on the basis of similarity. Performing clustering without knowing the number of clusters in a data set is not practical. The proposed method finds the optimal number of clusters for unclassified textual data with the help of internal cluster validation techniques. The internal indices used for validation in this work are the Silhouette index, the Davies-Bouldin index, and the Calinski-Harabasz score, applied together with the K-means clustering algorithm. The result obtained through this method is further validated by the Elbow method of finding the number of clusters.

Keywords: Clustering, Unsupervised, Unclassified, Internal cluster validation, Indexing
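
The procedure outlined in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the use of scikit-learn, the TF-IDF representation of the text, the toy corpus `docs`, the helper name `score_cluster_counts`, and the candidate range of k are all assumptions made for illustration. For each candidate k it runs K-means and records the three internal indices, plus the K-means inertia that an Elbow plot would use.

```python
from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics import (
    calinski_harabasz_score,
    davies_bouldin_score,
    silhouette_score,
)

# Hypothetical toy corpus: twelve short, distinct documents on three rough topics.
docs = [
    "the striker scored a goal in the football match",
    "the football team won the league match",
    "the coach praised the goalkeeper after the match",
    "fans cheered as the team scored a late goal",
    "the stock market fell as investors sold shares",
    "investors bought shares after the market rally",
    "the bank raised interest rates to curb inflation",
    "inflation and interest rates worried the stock market",
    "the telescope observed a distant galaxy and its stars",
    "astronomers discovered a planet orbiting a nearby star",
    "the space probe sent images of the planet surface",
    "stars in the galaxy were mapped by the space telescope",
]


def score_cluster_counts(texts, k_values, seed=0):
    """Run K-means for each k and collect internal validation indices."""
    # TF-IDF is an assumed text representation; dense arrays are used
    # because not all of the metric functions accept sparse input.
    X = TfidfVectorizer(stop_words="english").fit_transform(texts).toarray()
    rows = []
    for k in k_values:
        km = KMeans(n_clusters=k, n_init=10, random_state=seed).fit(X)
        rows.append(
            {
                "k": k,
                "silhouette": silhouette_score(X, km.labels_),  # higher is better
                "davies_bouldin": davies_bouldin_score(X, km.labels_),  # lower is better
                "calinski_harabasz": calinski_harabasz_score(X, km.labels_),  # higher is better
                "inertia": km.inertia_,  # within-cluster sum of squares, for the Elbow method
            }
        )
    return rows


rows = score_cluster_counts(docs, range(2, 7))

# Each index nominates its own preferred k; agreement between them
# is the kind of evidence the abstract describes.
best_silhouette = max(rows, key=lambda r: r["silhouette"])["k"]
best_davies_bouldin = min(rows, key=lambda r: r["davies_bouldin"])["k"]
best_calinski_harabasz = max(rows, key=lambda r: r["calinski_harabasz"])["k"]

for r in rows:
    print(r)
print("preferred k:", best_silhouette, best_davies_bouldin, best_calinski_harabasz)
```

The three indices need not agree on a corpus this small. Plotting `inertia` against `k` and looking for the bend in the curve gives the Elbow-method check mentioned in the abstract.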
