Abstract

We examine the problem of jointly selecting the number of components and variables in finite mixture regression models. We find that the Akaike information criterion is unsatisfactory for this purpose because it overestimates the number of components, which in turn results in incorrect variables being retained in the model. Therefore, we derive a new information criterion, the mixture regression criterion (MRC), that yields marked improvement in model selection due to what we call the “clustering penalty function.” Moreover, we prove the asymptotic efficiency of the MRC. We show that it performs well in Monte Carlo studies for the same or different covariates across components with equal or unequal sample sizes. We also present an empirical example on sales territory management to illustrate the application and efficacy of the MRC. Finally, we generalize the MRC to mixture quasi-likelihood and mixture autoregressive models, thus extending its applicability to non-Gaussian models, discrete responses, and dependent data.
