Abstract

Determining whether clusters are present in numerical data (cluster tendency assessment) is an important first step of cluster analysis. One tool for this task is the visual assessment of tendency (VAT) algorithm. VAT and improved VAT (iVAT) produce an image that provides visual evidence about the number of clusters to seek in the original dataset. These methods have been successful in revealing potential cluster structure in various datasets, but they can be computationally expensive for datasets with a very large number of samples. A scalable version of iVAT called siVAT approximates iVAT images, but siVAT itself can still be computationally expensive for big datasets. In this article, we introduce a modification of siVAT called siVAT+, which approximates cluster heat maps for large volumes of high-dimensional data much more rapidly than siVAT. We compare siVAT+ with siVAT on six large, high-dimensional datasets. Experimental results confirm that siVAT+ obtains images similar to siVAT images in a few seconds and is 8 to 55 times faster than siVAT.
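For readers unfamiliar with how a VAT image arises, the following is a minimal sketch of the basic VAT reordering as it is commonly described: a Prim-like ordering of the pairwise dissimilarity matrix. It is not the siVAT or siVAT+ method of this article, and it omits the iVAT path-based distance transform. The function names `vat_order` and `vat_image` are hypothetical, and Euclidean distance is an assumption made for illustration.

```python
import numpy as np


def vat_order(D):
    """Return a VAT-style ordering of the objects in a square dissimilarity matrix D."""
    n = D.shape[0]
    # Start from one endpoint of the largest pairwise dissimilarity.
    start, _ = np.unravel_index(np.argmax(D), D.shape)
    start = int(start)
    order = [start]
    selected = np.zeros(n, dtype=bool)
    selected[start] = True
    # min_dist[j]: smallest dissimilarity from object j to any selected object.
    min_dist = D[start].astype(float).copy()
    min_dist[selected] = np.inf  # selected objects are never picked again
    for _ in range(n - 1):
        # Append the unselected object closest to the selected set.
        nxt = int(np.argmin(min_dist))
        order.append(nxt)
        selected[nxt] = True
        min_dist = np.minimum(min_dist, D[nxt])
        min_dist[selected] = np.inf
    return np.array(order)


def vat_image(X):
    """Build the reordered dissimilarity matrix for data X (rows are samples)."""
    # Assumption: Euclidean distance; full O(n^2) matrix, so small n only.
    diff = X[:, None, :] - X[None, :, :]
    D = np.sqrt((diff ** 2).sum(axis=-1))
    order = vat_order(D)
    return D[np.ix_(order, order)], order
```

Displaying the returned matrix as a grayscale image typically shows dark blocks along the diagonal where the data contain well-separated groups. The quadratic memory and time cost of the full dissimilarity matrix in this sketch is the kind of expense that motivates scalable approximations.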
