Abstract

K-means (KM) and the Self-Organizing Map (SOM) are two popular and very different techniques for clustering data. Both require a set of training data that is used in an iterative process to find clusters in that set. For KM, the number of clusters must be assigned before a training session begins, whereas the number of SOM clusters is determined after a single training session is completed. In this paper we compare the clustering performance of these two methods using data from three mineral spectral libraries whose samples have been hierarchically labeled with Class, Subclass, and Group names. These names are used to determine the overall mineralogical purity of the clusterings as a function of cluster number. The degree of cluster overlap is also determined as a function of cluster number using the Davies-Bouldin (DB) index. We show that, in general, the purity and overlap of KM- and SOM-derived clusters differ significantly when the number of clusters is small compared to the number of training samples. The KM clusters are less pure and overlap more than the SOM clusters. The ramifications of these results for the accuracy of classifying spectra not used for training are discussed.
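The two evaluation measures named above can be illustrated with a minimal Python sketch. It assumes the common definition of purity (the fraction of samples carrying the majority label of their cluster) and the standard Davies-Bouldin index with Euclidean distances; the abstract does not state the exact formulas used in the paper, so these definitions, the function names, and the toy data are assumptions for illustration only.

```python
from collections import Counter
from math import dist  # Euclidean distance, Python 3.8+


def purity(cluster_ids, labels):
    """Fraction of samples whose label is the majority label of their cluster."""
    by_cluster = {}
    for cid, lab in zip(cluster_ids, labels):
        by_cluster.setdefault(cid, []).append(lab)
    majority = sum(Counter(labs).most_common(1)[0][1] for labs in by_cluster.values())
    return majority / len(labels)


def davies_bouldin(points, cluster_ids):
    """Standard Davies-Bouldin index; lower values mean less cluster overlap."""
    groups = {}
    for p, cid in zip(points, cluster_ids):
        groups.setdefault(cid, []).append(p)
    if len(groups) < 2:
        raise ValueError("the index needs at least two clusters")
    # Centroid of each cluster (coordinate-wise mean).
    centroids = {c: [sum(col) / len(pts) for col in zip(*pts)] for c, pts in groups.items()}
    # Scatter: mean distance of a cluster's members to its centroid.
    scatter = {c: sum(dist(p, centroids[c]) for p in pts) / len(pts) for c, pts in groups.items()}
    # For each cluster, take its worst (largest) similarity ratio to any other cluster.
    # Note: two clusters with identical centroids would cause a division by zero.
    worst = [
        max((scatter[i] + scatter[j]) / dist(centroids[i], centroids[j]) for j in groups if j != i)
        for i in groups
    ]
    return sum(worst) / len(worst)


# Hypothetical toy data: four 2-D "spectra", two clusters, Class-level labels.
toy_points = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]
toy_clusters = [0, 0, 1, 1]
toy_labels = ["Silicate", "Silicate", "Carbonate", "Silicate"]

toy_purity = purity(toy_clusters, toy_labels)
toy_db = davies_bouldin(toy_points, toy_clusters)
print(toy_purity, toy_db)
```

Evaluating both functions for a range of cluster numbers, once per label level (Class, Subclass, Group), would give curves of the kind the abstract describes: purity and DB index as functions of cluster number.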
