Abstract

Cluster validity indexes evaluate the fitness of data partitions produced by a clustering algorithm, and they are usually independent of the algorithm itself. However, their values may be heavily influenced by noise and outliers: even when noise and outliers do not change the clustering result, they may still distort the value of a validity index. The literature contains little discussion of the robustness of cluster validity indexes. In this paper, we analyze the robustness of a validity index using the ϕ function of the M-estimate and then propose several robust-type validity indexes. We first discuss the validity measure on a single data point and focus on validity indexes that can be categorized as mean-type. We then propose median-type validity indexes that are robust to noise and outliers. Comparative examples on numerical and real data sets show that the proposed median-type validity indexes work better than the mean-type validity indexes.
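
The contrast between mean-type and median-type aggregation can be illustrated with a minimal sketch. It is not one of the indexes proposed in the paper: the per-point measure (squared distance to the nearest cluster center), the function names `nearest_sq_dist` and `compactness`, and the toy one-dimensional data are all assumptions chosen only to show how a single outlier affects each kind of aggregate.

```python
from statistics import mean, median

def nearest_sq_dist(x, centers):
    """Hypothetical per-point validity measure: squared distance to the nearest center."""
    return min((x - c) ** 2 for c in centers)

def compactness(points, centers, aggregate):
    """Aggregate the per-point measure with `aggregate` (mean or median)."""
    return aggregate([nearest_sq_dist(x, centers) for x in points])

# Toy 1-D data (assumed): two tight clusters around 0 and 10.
centers = [0.0, 10.0]
clean = [-0.5, 0.0, 0.5, 9.5, 10.0, 10.5]
noisy = clean + [100.0]  # the same data plus one far outlier

mean_clean = compactness(clean, centers, mean)
mean_noisy = compactness(noisy, centers, mean)
median_clean = compactness(clean, centers, median)
median_noisy = compactness(noisy, centers, median)

print(f"mean-type:   clean={mean_clean:.4f}  noisy={mean_noisy:.4f}")
print(f"median-type: clean={median_clean:.4f}  noisy={median_noisy:.4f}")
```

In this toy case the cluster centers are held fixed, mirroring the situation in which the outlier does not change the clustering result. The mean-type aggregate grows by several orders of magnitude once the outlier is added, while the median-type aggregate is unchanged.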
