High-Dimensional Unsupervised Selection and Estimation of a Finite Generalized Dirichlet Mixture Model Based on Minimum Message Length

Nizar Bouguila, Djemel Ziou

doi:10.1109/tpami.2007.1095

Abstract

We consider the problem of determining the structure of high-dimensional data without prior knowledge of the number of clusters. Data are represented by a finite mixture model based on the generalized Dirichlet distribution. The generalized Dirichlet distribution has a more general covariance structure than the Dirichlet distribution and is flexible enough to approximate both symmetric and asymmetric distributions, which makes it more practical for modeling. An important problem in mixture modeling is determining the number of clusters, since a mixture with too many or too few components may not approximate the true model well. Here, we apply the minimum message length (MML) principle to this problem: the MML criterion is derived so as to select the number of clusters in the mixture model that best describes the data. We compare it with other selection criteria. The validation involves synthetic data, real data clustering, and two real applications: classification of web pages and texture database summarization for efficient retrieval.
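To make the claim about the more general covariance structure concrete, here is a minimal sketch of the generalized Dirichlet log-density. It assumes the standard Connor-Mosimann parameterization with parameters alpha_i and beta_i; the abstract does not spell out the density, so the function name `gd_logpdf` and the parameter layout are illustrative choices, not the authors' code.

```python
import math


def gd_logpdf(x, alpha, beta):
    """Log-density of a generalized Dirichlet distribution at x.

    Assumed (Connor-Mosimann) form, for x_i > 0 and sum(x) < 1:
        p(x) = prod_i  G(a_i + b_i) / (G(a_i) G(b_i))
                       * x_i^(a_i - 1) * (1 - x_1 - ... - x_i)^(g_i)
    with g_i = b_i - a_{i+1} - b_{i+1} for i < d and g_d = b_d - 1.
    """
    d = len(x)
    if not (len(alpha) == len(beta) == d):
        raise ValueError("x, alpha and beta must have the same length")

    logp = 0.0
    cum = 0.0  # running sum x_1 + ... + x_i
    for i in range(d):
        cum += x[i]
        if x[i] <= 0.0 or cum >= 1.0:
            return -math.inf  # outside the support
        if i < d - 1:
            gamma_i = beta[i] - alpha[i + 1] - beta[i + 1]
        else:
            gamma_i = beta[i] - 1.0
        # Log of the Beta-type normalizing constant for dimension i.
        logp += (math.lgamma(alpha[i] + beta[i])
                 - math.lgamma(alpha[i]) - math.lgamma(beta[i]))
        logp += (alpha[i] - 1.0) * math.log(x[i])
        logp += gamma_i * math.log(1.0 - cum)
    return logp
```

Under this parameterization the ordinary Dirichlet is the special case beta_i = alpha_{i+1} + beta_{i+1} for every i < d, where all exponents g_i with i < d vanish. The d-dimensional generalized Dirichlet has 2d free parameters against d + 1 for the Dirichlet, which is where the extra flexibility in the covariance structure comes from.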

Journal: IEEE Transactions on Pattern Analysis and Machine Intelligence
Publication Date: Oct 1, 2007
Citations: 206

Similar Papers

MML-Based Approach for High-Dimensional Unsupervised Learning Using the Generalized Dirichlet Mixture
N Bouguila ... D Ziou
20 Jun 2005

Generalized Dirichlet distribution in Bayesian analysis
Tzu-Tsung Wong
Applied Mathematics and Computation | VOL. 97
16 Nov 1998

A hybrid SEM algorithm for high-dimensional unsupervised learning using a finite generalized Dirichlet mixture
N. Bouguila ... D. Ziou
IEEE Transactions on Image Processing | VOL. 15
01 Sep 2006

Unsupervised Learning of Correlated Multivariate Gaussian Mixture Models Using MML
Yudi Agusta ... David L Dowe
01 Jan 2003
