Abstract

Finite mixtures of regressions (FMR) are among the most widely used models for dealing with heterogeneity in regression problems. FMR is a model-based clustering approach that models the data by assigning each observation to one of K latent regression clusters. In some applications, it is desirable to cluster whole groups of observations together rather than individual observations. We present an extension of standard FMR, which we call the Grouped Mixture of Regressions (GMR), that allows for a known group structure among observations in addition to their possibly unknown heterogeneity. The research is motivated by a large longitudinal financial dataset from a brand of automotive dealerships across the United States. The task was to cluster the dealers based on their financial performance and also to improve performance prediction for individual dealers. We derive a fast and scalable algorithm for estimating the model parameters using Expectation–Maximization (EM). We also show how the group structure can improve prediction by sharing information among the members of each group, as reflected in the posterior predictive density under GMR. The performance of the approach is assessed using both synthetic data and the dealership data. Among our findings are the superior predictive performance of GMR and the ability to correctly detect the number of clusters using simple cross-validation.
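
To make the grouped-clustering idea concrete, the following is a minimal sketch of an EM loop for a grouped mixture of regressions. It is not the algorithm derived in the paper. It assumes Gaussian noise with one variance per cluster, and it assumes that all observations in a known group share a single latent cluster label, so responsibilities are computed per group from the summed log-densities of the group's observations. The function name `fit_gmr` and all of its parameters are hypothetical and chosen only for illustration.

```python
import numpy as np

def fit_gmr(X, y, groups, K, n_iter=50, seed=0):
    """EM sketch for a grouped mixture of Gaussian linear regressions.

    Assumption (not taken from the paper): every observation in a group
    shares one latent cluster label, and cluster k has its own
    coefficient vector beta[k] and noise variance sigma2[k].
    """
    rng = np.random.default_rng(seed)
    # g[i] is the integer index (0..R-1) of the group that row i belongs to.
    group_ids, g = np.unique(groups, return_inverse=True)
    R = len(group_ids)
    n, d = X.shape

    # Random soft initial assignment of groups to clusters, shape (R, K).
    resp = rng.dirichlet(np.ones(K), size=R)
    beta = np.zeros((K, d))
    sigma2 = np.ones(K)
    loglik_trace = []

    for _ in range(n_iter):
        # M-step: weighted least squares per cluster, where each row is
        # weighted by the responsibility of its group.
        w = resp[g]                                   # shape (n, K)
        for k in range(K):
            Xw = X * w[:, k:k + 1]
            # Tiny ridge term only for numerical stability.
            beta[k] = np.linalg.solve(Xw.T @ X + 1e-8 * np.eye(d), Xw.T @ y)
            resid_k = y - X @ beta[k]
            sigma2[k] = max((w[:, k] * resid_k**2).sum() / max(w[:, k].sum(), 1e-12), 1e-8)
        pi = resp.mean(axis=0)                        # mixing weights over groups

        # E-step: sum per-observation log-densities within each group,
        # then normalize across clusters with a stable log-sum-exp.
        resid = y[:, None] - X @ beta.T               # shape (n, K)
        logpdf = -0.5 * (np.log(2 * np.pi * sigma2) + resid**2 / sigma2)
        group_ll = np.zeros((R, K))
        np.add.at(group_ll, g, logpdf)
        log_joint = np.log(np.maximum(pi, 1e-300)) + group_ll
        m = log_joint.max(axis=1, keepdims=True)
        log_norm = m + np.log(np.exp(log_joint - m).sum(axis=1, keepdims=True))
        resp = np.exp(log_joint - log_norm)
        loglik_trace.append(float(log_norm.sum()))    # observed-data log-likelihood

    return beta, sigma2, pi, resp, loglik_trace
```

Because responsibilities are tied within a group, every member of a group contributes evidence about the group's cluster. This is one way to read the abstract's point about sharing information among group members. The number of clusters `K` is fixed here; selecting it by cross-validation, as the abstract describes, would wrap this function in an outer loop over candidate values.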
