Abstract

Data clustering is an unsupervised technique that partitions data into groups based on the similarity of the objects, measured with distance metrics such as Euclidean or cosine. In contrast to the Euclidean metric, the cosine measure computes similarity from the angle between the data vectors, normalizing by their magnitudes. As a result, it has performed far better than the standard Euclidean metric in applications involving real-time data clustering. Popular clustering techniques such as k-means and hierarchical approaches require an initial k-value (the clustering tendency, i.e., the number of clusters), and the quality of the resulting clusters depends on it. Users with domain knowledge can assign the k-value. However, the right k-value is often not known in advance and is difficult to assign. A review of prior work shows that the visual technique known as visual assessment of (cluster) tendency (VAT) effectively addresses the clustering tendency problem. VAT uses the Euclidean metric to derive the similarity features in its algorithm. An enhanced visual technique, cosine-based VAT (cVAT), outperformed VAT in text and speech clustering applications. However, cVAT extracts the similarity features with respect to a single viewpoint. This paper develops a multi-viewpoints-based cosine similarity measure (MVPCSM) for a more informative assessment. Instead of using a single reference point as a typical cosine measure does, MVPCSM derives more precise similarity features from several viewpoints. The existing techniques and the proposed technique (MVPCSM-VAT) are evaluated using clustering accuracy (CA) and normalized mutual information (NMI). The results show that the proposed MVPCSM-VAT performs 15-25% better than VAT and cVAT in terms of CA and NMI. The proposed method also obtains higher-quality data clusters than MVS-VAT.
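The ideas in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation: the function names are hypothetical, the VAT step follows the commonly described reordering of a dissimilarity matrix (start from an object in the most dissimilar pair, then repeatedly append the unvisited object closest to the visited set), and the multi-viewpoint function is only one assumed form of a multi-viewpoint cosine measure (averaging the cosine seen from every other object), since the abstract does not give the MVPCSM formula.

```python
import numpy as np

def cosine_dissimilarity(X):
    """Pairwise cosine dissimilarity 1 - cos(x_i, x_j) for the rows of X."""
    X = np.asarray(X, dtype=float)
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    unit = X / np.where(norms == 0, 1.0, norms)  # guard against zero vectors
    D = 1.0 - np.clip(unit @ unit.T, -1.0, 1.0)
    np.fill_diagonal(D, 0.0)
    return D

def multi_viewpoint_dissimilarity(X):
    """ASSUMED form, not the paper's MVPCSM: for each pair (i, j), average
    cos(x_i - x_h, x_j - x_h) over all other objects h used as viewpoints."""
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    total = np.zeros((n, n))
    for h in range(n):
        S = 1.0 - cosine_dissimilarity(X - X[h])  # similarity seen from x_h
        S[h, :] = 0.0  # skip pairs that contain the viewpoint itself
        S[:, h] = 0.0
        total += S
    D = 1.0 - total / max(n - 2, 1)
    np.fill_diagonal(D, 0.0)
    return D

def vat_order(D):
    """Reorder a dissimilarity matrix in the VAT style.
    Returns the reordered matrix and the permutation of object indices."""
    D = np.asarray(D, dtype=float)
    n = D.shape[0]
    # Start from one object of the most dissimilar pair.
    i, _ = np.unravel_index(np.argmax(D), D.shape)
    order = [int(i)]
    remaining = [k for k in range(n) if k != i]
    while remaining:
        # Append the unvisited object closest to any visited object.
        sub = D[np.ix_(order, remaining)]
        _, col = np.unravel_index(np.argmin(sub), sub.shape)
        order.append(remaining.pop(int(col)))
    order = np.array(order)
    return D[np.ix_(order, order)], order

# Toy data (illustrative only): two groups of vectors with different directions.
X = np.array([[1, 0], [0, 1], [1, 0.1], [0.1, 1], [1, 0.05], [0.05, 1]])
reordered, order = vat_order(cosine_dissimilarity(X))
print(order)
```

Displaying the reordered matrix as a grayscale image gives the VAT picture: dark square blocks along the diagonal suggest clusters, and counting them gives a candidate k-value. Passing `multi_viewpoint_dissimilarity(X)` to `vat_order` instead swaps in the assumed multi-viewpoint measure while keeping the reordering step unchanged.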
