Abstract

A hard partitioning clustering algorithm assigns a point that is equidistant from several clusters to exactly one of them, even though such a point could plausibly belong to more than one cluster at once. Fuzzy cluster analysis instead assigns membership coefficients to data points, so a point that is equidistant between two clusters can belong to both clusters at the same time. For a subset of the CiteScore dataset, the fuzzy clustering (fanny) and fuzzy c-means (fcm) algorithms were applied to study the data points that lie equidistant between clusters. Before analysis, the clusterability of the dataset was evaluated with the Hopkins statistic, which gave 0.4371, a value below 0.5 that was taken to indicate that the data are highly clusterable. The optimal number of clusters was determined using the NbClust package, in which nine indices proposed three clusters as the best solution. Further, an appropriate value of the fuzziness parameter <em>m</em> was sought by examining how the distribution of membership values changes as <em>m</em> varies from 1 to 2. The coefficient of variation (CV), also known as relative variability, was evaluated to study the spread of the data. The time complexity of the fuzzy clustering (fanny) and fuzzy c-means algorithms was evaluated by keeping the number of data points constant and varying the number of clusters.
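
To make the role of membership coefficients and of the fuzziness parameter concrete, the following is a minimal sketch of the textbook fuzzy c-means membership update for fixed cluster centers. It is an illustration only, not the authors' implementation: the function name `fcm_memberships`, the toy points, and the two centers are assumptions made for this example. The fanny algorithm optimizes a different objective and is not reproduced here. The formula is defined only for m > 1, so the sketch rejects m = 1.

```python
import numpy as np

def fcm_memberships(points, centers, m=2.0):
    """Textbook fuzzy c-means membership update for fixed centers (illustrative)."""
    if m <= 1.0:
        raise ValueError("the fuzziness parameter m must be greater than 1")
    points = np.asarray(points, dtype=float)
    centers = np.asarray(centers, dtype=float)
    # d[i, j] = Euclidean distance from point i to center j
    d = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
    d = np.fmax(d, 1e-12)  # guard against division by zero at a center
    # ratio[i, j, k] = d[i, j] / d[i, k]
    ratio = d[:, :, None] / d[:, None, :]
    # u[i, j] = 1 / sum_k (d[i, j] / d[i, k]) ** (2 / (m - 1))
    return 1.0 / np.sum(ratio ** (2.0 / (m - 1.0)), axis=2)

# Hypothetical toy data: two centers, one equidistant point, one closer to center 0.
centers = [[0.0, 0.0], [4.0, 0.0]]
points = [[2.0, 0.0], [1.0, 0.0]]

for m in (1.2, 1.5, 2.0):
    u = fcm_memberships(points, centers, m)
    print(f"m={m}:", np.round(u, 3).tolist())
```

In this toy setting the equidistant point receives a membership of 0.5 in each cluster for every m > 1, which is the situation a hard partition cannot express. The second point has memberships of 0.9 and 0.1 at m = 2 and moves toward a hard assignment as m approaches 1.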
