Exploring the number of groups in robust model-based clustering

L A García-Escudero, A Gordaliza, C Matrán, A Mayo-Iscar

doi:10.1007/s11222-010-9194-z

Open Access

https://doi.org/10.1007/s11222-010-9194-z

Journal: Statistics and Computing	Publication Date: Jul 28, 2010
Citations: 89	License type: cc-by

Affiliation: University of Valladolid

Abstract

Two key questions in Clustering problems are how to determine the number of groups properly and how to measure the strength of group-assignments. These questions are especially involved when the presence of a certain fraction of outlying data is also expected. Any answer to these two key questions should depend on the assumed probabilistic model, the allowed group scatters and what we understand by noise. With this in mind, some exploratory trimming-based tools are presented in this work together with their justifications. The monitoring of the optimal values reached when solving a robust clustering criterion and the use of some discriminant factors are the basis for these exploratory tools.

Highlights

Two key questions in Clustering problems are how to choose the number of groups properly and measure the strength of group-assignments
For a given TCLUST clustering solution, we introduce some “confirmatory” graphical tools that will help us to evaluate the quality of the cluster assignments and the strength of the trimming decisions
We have proposed to take advantage of this fact when trying to choose a suitable k in Clustering problems
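
The strength of a group assignment can be illustrated with a small sketch. The text above does not give the formula for the discriminant factors, so the code assumes one natural choice: for each observation, the log-ratio between the second largest and the largest weighted Gaussian density over the fitted groups. The function names, the Gaussian components and the toy parameters are illustrative assumptions, and trimmed observations are not handled.

```python
import numpy as np


def log_gaussian(x, mean, cov):
    """Log-density of a multivariate normal at each row of x."""
    d = len(mean)
    diff = x - mean
    _, logdet = np.linalg.slogdet(cov)
    # Squared Mahalanobis distance of every row to the group centre.
    maha = np.einsum("ij,ij->i", diff @ np.linalg.inv(cov), diff)
    return -0.5 * (d * np.log(2 * np.pi) + logdet + maha)


def discriminant_factors(x, weights, means, covs):
    """Assumed form: log(second best / best) weighted density per row.

    Values are <= 0. Values close to 0 flag doubtful assignments,
    very negative values flag clear-cut ones.
    """
    log_d = np.column_stack([
        np.log(w) + log_gaussian(x, m, s)
        for w, m, s in zip(weights, means, covs)
    ])
    sorted_log_d = np.sort(log_d, axis=1)
    return sorted_log_d[:, -2] - sorted_log_d[:, -1]


# Toy fitted parameters (hypothetical, not taken from the paper).
weights = [0.5, 0.5]
means = [np.array([0.0, 0.0]), np.array([10.0, 0.0])]
covs = [np.eye(2), np.eye(2)]
points = np.array([[0.0, 0.0], [5.0, 0.0]])
factors = discriminant_factors(points, weights, means, covs)
```

In this toy setting the first point sits on a group centre and gets a strongly negative factor, while the second point is equidistant from both centres and gets a factor of zero, the most doubtful assignment possible under this definition.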

Summary

Introduction

Two key questions in Clustering problems are how to choose the number of groups properly and how to measure the strength of group-assignments. The so-called “spurious-outliers model” assumes the presence of a fraction α of the data generated by an extraneous mechanism that may be trimmed off or discarded. Within this framework, the TCLUST methodology presented in García-Escudero et al. (2008) is able to handle different types of constraints on the group scatter matrices, which allows point b) to be addressed through a restriction on the eigenvalues of the group scatter matrices. The result of applying TCLUST to this data set appears in Figure 1(a) when k = 3, α = 0 and a large value of the group-scatter constraint constant, c = 50, are chosen. Proposition 3 in the Appendix shows that the discriminant factors presented here consistently estimate some population discriminant factors defined for the theoretical (unknown) distribution that generates our data set.
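
The two tuning ideas in this paragraph, trimming a fraction α and bounding the group scatters through a constant c, can be sketched in a few lines. This is only an illustration under stated assumptions, not the TCLUST algorithm itself: the constraint is taken to be a bound c on the ratio between the largest and the smallest eigenvalue over all group scatter matrices, the number of trimmed observations is taken to be floor(nα), and the helper names and toy values are hypothetical.

```python
import numpy as np


def eigenvalue_ratio(scatter_matrices):
    """Largest over smallest eigenvalue across all group scatter matrices."""
    eigenvalues = np.concatenate(
        [np.linalg.eigvalsh(s) for s in scatter_matrices]
    )
    return eigenvalues.max() / eigenvalues.min()


def satisfies_constraint(scatter_matrices, c):
    """Assumed form of the restriction: eigenvalue ratio bounded by c."""
    return eigenvalue_ratio(scatter_matrices) <= c


def trim_mask(scores, alpha):
    """Boolean mask that discards the floor(n * alpha) lowest-scoring rows.

    `scores` stands for any per-observation fit measure (for example the
    best weighted density); rounding with floor is an assumption.
    """
    n = len(scores)
    n_trim = int(np.floor(n * alpha))
    keep = np.ones(n, dtype=bool)
    keep[np.argsort(scores)[:n_trim]] = False
    return keep


# Toy scatter matrices and scores (hypothetical values).
scatters = [np.diag([1.0, 4.0]), np.diag([2.0, 50.0])]
ratio = eigenvalue_ratio(scatters)
keep = trim_mask(np.array([0.5, 0.1, 0.9, 0.3]), alpha=0.25)
```

A large c, such as the c = 50 mentioned above, leaves the group scatters nearly unconstrained, whereas c = 1 would force all groups to be spherical with equal scatter. Setting α = 0 keeps every observation, so the mask is all true.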

Simulated Examples

Clustering and mixture approaches

Objective

Old Faithful Geyser data

Findings

Swiss Bank Notes data
