Abstract

It is well known that, in the presence of outliers, the single linkage algorithm generally fails to identify clusters. In this paper, we construct a new version of this algorithm that is less sensitive to outliers, and study both its theoretical properties and its practical behavior. In particular, we provide an oracle-type inequality which guarantees that our procedure recovers clusters with high probability under mild assumptions on the distribution of the outliers. Using this inequality, we prove the consistency of our method and exhibit rates of convergence in various situations. The performance of the approach is also assessed through simulation studies, including a thorough comparison with several classical clustering algorithms on simulated data.
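To make the failure mode concrete, the sketch below implements plain single linkage and shows how a chain of outliers placed between two well-separated groups gets absorbed into one of the recovered clusters (the "chaining" effect). The `strip_isolated` denoising step is purely illustrative, a simple nearest-neighbour filter, and is not the robust procedure proposed in the paper; all point coordinates and the radius parameter are made up for this example.

```python
import math
from itertools import combinations

def single_linkage(points, k):
    """Plain single-linkage clustering: repeatedly merge the two clusters
    whose closest pair of points is nearest, until k clusters remain."""
    clusters = [[p] for p in points]

    def link(a, b):  # single-linkage (minimum) distance between clusters
        return min(math.dist(p, q) for p in a for q in b)

    while len(clusters) > k:
        i, j = min(combinations(range(len(clusters)), 2),
                   key=lambda ij: link(clusters[ij[0]], clusters[ij[1]]))
        clusters[i].extend(clusters.pop(j))
    return clusters

def strip_isolated(points, r):
    """Illustrative denoising step (NOT the paper's procedure): keep only
    points that have at least one neighbour within radius r."""
    return [p for p in points
            if any(q is not p and math.dist(p, q) <= r for q in points)]

group_a = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0)]
group_b = [(10.0, 0.0), (10.0, 1.0), (11.0, 0.0)]
bridge = [(3.0, 0.0), (5.0, 0.0), (7.0, 0.0)]  # outliers between the groups

# Plain single linkage chains through the bridge: one recovered "cluster"
# mixes the points of group_a with all three outliers.
naive = single_linkage(group_a + group_b + bridge, 2)

# Removing isolated points first (no neighbour within radius 1.5)
# restores the two true groups.
cleaned = single_linkage(strip_isolated(group_a + group_b + bridge, 1.5), 2)
```

Because single linkage merges on the *minimum* inter-cluster distance, any low-density path of outliers between two groups is enough to fuse or contaminate them, which is exactly the sensitivity the paper's modified algorithm is designed to control.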
