In cluster analysis, one often obtains several partitions of a data set by running different clustering methods and algorithms under a variety of hyperparameter settings and tunings. The number of clusters K is one of the most relevant of these hyperparameters. Cluster selection is the task of choosing the desired partitions among these candidates. Bootstrap Quadratic Scoring is a recently introduced method in which cluster selection is performed by optimizing a score that is attached to each partition and based on the quadratic discriminant function. Previously, we proposed estimating this cluster score via bootstrap resampling and investigated the proposed estimator through numerical experiments and real data applications. However, that earlier work did not provide theoretical guarantees. In this paper, we fill that gap: we study the asymptotic behavior of the scoring method and show that the proposed estimator converges to a well-defined population counterpart.