Weighted k‐means (WK‐means) is a well‐known method for automated feature weight learning within the conventional k‐means clustering framework. In this paper, we analytically investigate the strong consistency of the WK‐means algorithm under independent and identically distributed sampling of the data points. The choice of dissimilarity measure plays a key role in partitioning the data and detecting the inherent groups in a dataset. We present a proof of strong consistency of the WK‐means algorithm when the dissimilarity measure is a nearmetric. The proof extends to dissimilarity measures that are increasing functions of a nearmetric. Through detailed experiments, we demonstrate that WK‐means‐type algorithms equipped with a nearmetric can be effective, especially when some of the features are unimportant in revealing the cluster structure of the dataset.
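For readers unfamiliar with the procedure, the following is a minimal sketch of a WK‐means‐type iteration, not the implementation used in this paper. It assumes the per‐feature squared difference as the dissimilarity, chosen here only as an example because it satisfies a relaxed triangle inequality with constant 2, and the commonly used closed‐form weight update with exponent beta > 1. The function name `wk_means`, its parameters, and the small floor applied to zero dispersions are illustrative assumptions.

```python
import numpy as np

def wk_means(X, k, beta=2.0, n_iter=20, init_centers=None, seed=0):
    """Sketch of a WK-means-type iteration (hypothetical helper, beta > 1)."""
    X = np.asarray(X, dtype=float)
    n, m = X.shape
    rng = np.random.default_rng(seed)
    if init_centers is None:
        # Assumption: initialise centers with k distinct data points.
        centers = X[rng.choice(n, size=k, replace=False)].copy()
    else:
        centers = np.asarray(init_centers, dtype=float).copy()
    weights = np.full(m, 1.0 / m)  # start from uniform feature weights

    for _ in range(n_iter):
        # Per-feature squared differences, shape (n, k, m).
        diff = (X[:, None, :] - centers[None, :, :]) ** 2
        # Assign each point to the center with the smallest weighted dissimilarity.
        labels = np.argmin(diff @ (weights ** beta), axis=1)
        # Update each non-empty cluster's center to the mean of its members.
        for j in range(k):
            members = X[labels == j]
            if len(members) > 0:
                centers[j] = members.mean(axis=0)
        # Within-cluster dispersion of every feature.
        D = ((X - centers[labels]) ** 2).sum(axis=0)
        # Assumption: floor zero dispersions to keep the update finite.
        D = np.maximum(D, 1e-12)
        # Closed-form update: w_l proportional to D_l^(-1/(beta-1)), summing to 1.
        inv = D ** (-1.0 / (beta - 1.0))
        weights = inv / inv.sum()
    return labels, centers, weights
```

In this sketch, features with small within‐cluster dispersion receive large weights, so a feature that does not separate the clusters is down‐weighted in later assignment steps. The result depends on the initial centers, which can be supplied through `init_centers`.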