Abstract
The Integrated Completed Likelihood (ICL) criterion was introduced by Biernacki, Celeux and Govaert (2000) in the model-based clustering framework to select a relevant number of classes, and it has since been used by statisticians in various application areas. This work proposes a theoretical study of ICL. A contrast related to the clustering objective is introduced: the conditional classification likelihood. An estimator and model selection criteria are derived from it. The properties of these new procedures are studied, and ICL is proved to be an approximation of one of these criteria. These results stand in contrast to the prevailing view that ICL is not consistent. Moreover, they give insight into the notion of class underlying ICL and inform a broader reflection on the notion of class in clustering. General results on penalized minimum contrast criteria and upper bounds on the bracketing entropy in parametric situations are derived, which may be useful in their own right. Practical solutions for computing the introduced procedures are proposed, notably an adapted EM algorithm and a new initialization method for EM-like algorithms that helps improve estimation in Gaussian mixture models.
Highlights
Model-based clustering is introduced in Sections 1.1 and 1.2
The main topic of this work is the choice of the number of classes in a model-based clustering framework, and the choice of the number of components of a Gaussian mixture
Even for data arising from a mixture distribution, a relevant number of classes may differ from the true number of components of the mixture
Summary
The main topic of this work is the choice of the number of classes in a model-based clustering framework, and the choice of the number of components of a Gaussian mixture. We prove that ICL is a penalized contrast criterion whose contrast differs from the standard likelihood. This explains why it is neither surprising nor a drawback that ICL does not asymptotically select the “true” number of components, even when the “true” model is considered. We introduce this new contrast, Lcc (Section 2.1), not because we believe a priori that it is the best one for clustering, but because it makes it possible to study and understand ICL theoretically.