Abstract

Due to the lack of development in the probabilistic and statistical aspects of clustering research, clustering procedures are often regarded as heuristics generating artificial clusters from a given set of sample data. In this paper, a clustering procedure that is useful for drawing statistical inference about the underlying population from a random sample is developed. It is based on the uniformly consistent kth nearest neighbour density estimate, and is applicable to both case-by-variable data matrices and case-by-case dissimilarity matrices. The proposed clustering procedure is shown to be asymptotically consistent for high-density clusters in several dimensions, and its small-sample behaviour is illustrated by an empirical example.
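
As a rough illustration of the density estimate on which the procedure is based, the sketch below assumes the standard kth nearest neighbour form f_n(x) = k / (n V_k(x)), where V_k(x) is the volume of the smallest ball centred at x that contains k sample points. The function names and the choice of Euclidean distance are assumptions made for illustration and are not taken from the paper. The clustering step itself is not shown.

```python
import math


def unit_ball_volume(d):
    """Volume of the unit ball in d dimensions."""
    return math.pi ** (d / 2) / math.gamma(d / 2 + 1)


def knn_density(x, sample, k):
    """kth nearest neighbour density estimate at point x (illustrative sketch).

    x      : tuple of floats (a point in d dimensions)
    sample : list of tuples, the observed sample
    k      : number of neighbours, 1 <= k <= len(sample)
    """
    n = len(sample)
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= len(sample)")
    d = len(x)
    # Distance from x to its kth nearest sample point (Euclidean, by assumption).
    r_k = sorted(math.dist(x, p) for p in sample)[k - 1]
    if r_k == 0.0:
        # k or more sample points coincide with x; the estimate is unbounded.
        return math.inf
    # f_n(x) = k / (n * volume of the ball of radius r_k around x)
    return k / (n * unit_ball_volume(d) * r_k ** d)
```

For a one-dimensional sample at 0, 1, 2, 3, 4 with k = 3, the third-nearest point to x = 2 lies at distance 1, the ball has length 2, and the estimate is 3 / (5 × 2) = 0.3.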

