Abstract

Finite mixture models can adequately model population heterogeneity when this heterogeneity arises from a finite number of relatively homogeneous clusters. A good example of such a situation is modeling market segmentation. Order selection in mixture models, i.e. selecting the correct number of components in the mixture model, however, is a problem which has not been satisfactorily resolved. Existing simulation results in the literature do not completely agree with each other. Moreover, it appears that the performance of different proposed selection methods is affected by the type of model and the parameter values. Furthermore, most existing results are based on simulations where the true generating model is identical to one of the models in the candidate set. In order to partly fill this gap we carried out a simulation study for finite mixture models of normal linear regressions. We included several types of model mis-specification to study the robustness of 18 order selection methods. Furthermore, we compared the performance of these selection methods based on unpenalized and penalized estimates of the model parameters. The results indicate that order selection based on penalized estimates greatly improves the success rates of all order selection methods. The most successful methods were MRC, MRCk, MDL2, ICL and ICL-BIC but not one method was consistently good or best for all types of model mis-specification.

Highlights

  • Finite mixtures present a very attractive modeling framework to increase model flexibility without the high-dimensional parameter spaces used in non-parametric or mixed modeling (McLachlan and Peel 2000)

  • In order to partly fill this gap we carried out a simulation study for finite mixture models of normal linear regressions

  • Order selection in finite mixture models is not a simple problem, which our simulation seems to confirm


Summary

Introduction

Finite mixtures present a very attractive modeling framework to increase model flexibility without the high-dimensional parameter spaces used in non-parametric or mixed modeling (McLachlan and Peel 2000). A regular statistical model is too rigid to adequately represent possible heterogeneity in the population. This heterogeneity can often be captured by a mixture of parametric models. Mixture modeling brings complications of its own, the most important of which is selecting the correct number of components (McLachlan and Peel 2000), which we will refer to as order selection. Not surprisingly, this has generated a lot of theoretical and applied research.
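To make the order selection problem concrete, the following is a minimal sketch of fitting mixtures of normal linear regressions by EM for several candidate orders and comparing them with BIC. It is not the code used in the study. The function names (`fit_mixture_regression`, `bic`, `bic_by_order`), the simulated data, and the choice of BIC as the criterion are assumptions made for illustration, and the small variance floor is only a crude guard against degenerate components, not the penalized estimator examined in the paper.

```python
import numpy as np

def fit_mixture_regression(X, y, k, n_iter=300, seed=0):
    """EM for a k-component mixture of normal linear regressions.

    Returns the log-likelihood at the final parameter estimates.
    Illustrative sketch only: unpenalized apart from a variance floor.
    """
    rng = np.random.default_rng(seed)
    n, p = X.shape
    # Random soft initialisation of the responsibilities (n x k).
    resp = rng.dirichlet(np.ones(k), size=n)
    loglik = -np.inf
    for _ in range(n_iter):
        # M-step: mixing weights, then weighted least squares per component.
        pi = np.clip(resp.mean(axis=0), 1e-12, None)
        log_dens = np.empty((n, k))
        for j in range(k):
            w = resp[:, j]
            Xw = X * w[:, None]
            beta = np.linalg.solve(X.T @ Xw + 1e-8 * np.eye(p), Xw.T @ y)
            resid = y - X @ beta
            # Variance floor (assumption): avoids collapse to zero variance.
            sigma2 = max(np.sum(w * resid**2) / (np.sum(w) + 1e-12), 1e-6)
            log_dens[:, j] = (np.log(pi[j])
                              - 0.5 * np.log(2 * np.pi * sigma2)
                              - 0.5 * resid**2 / sigma2)
        # E-step: responsibilities via a numerically stable log-sum-exp.
        m = log_dens.max(axis=1, keepdims=True)
        log_mix = m + np.log(np.exp(log_dens - m).sum(axis=1, keepdims=True))
        resp = np.exp(log_dens - log_mix)
        loglik = float(log_mix.sum())
    return loglik

def bic(loglik, k, p, n):
    """BIC with k*(p+1) regression/variance parameters and k-1 free weights."""
    n_params = k * (p + 1) + (k - 1)
    return -2.0 * loglik + n_params * np.log(n)

def bic_by_order(X, y, max_k=3, n_starts=5):
    """BIC for each candidate order, keeping the best of several EM starts."""
    n, p = X.shape
    result = {}
    for k in range(1, max_k + 1):
        best = max(fit_mixture_regression(X, y, k, seed=s)
                   for s in range(n_starts))
        result[k] = bic(best, k, p, n)
    return result

# Simulated data (assumption): two well-separated regression lines.
rng = np.random.default_rng(1)
n = 300
x = rng.uniform(0, 1, n)
X = np.column_stack([np.ones(n), x])
z = rng.integers(0, 2, n)
y = np.where(z == 0, 1 + 2 * x, 8 - 3 * x) + rng.normal(0, 0.3, n)

bic_values = bic_by_order(X, y)
print(bic_values)  # the order with the smallest BIC would be selected
```

In this easy, correctly specified setting the two-component fit should clearly beat the single regression. The situations studied in the paper, with mis-specified generating models and many competing criteria, are considerably harder than this sketch suggests.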

Finite Mixtures of Linear Regressions
Mixture Regression in Practice
Penalizing the Likelihood
Order Selection
Previous Results
Experimental Design
Results and Discussion
Conclusion