Specifically addressing different customer segments, e.g., via revenue management or customer relationship management, lets firms optimise their market response. Identifying such segments requires analysing large amounts of transactional data. To that end, we present a nonparametric approach to estimate the number of customer segments from censored panel data recording sales. As the approach models customer segments and choices via a general finite mixture model, it applies to a diverse range of settings. We evaluate several model selection criteria and imputation methods to compensate for censored observations under different demand scenarios. We measure estimation performance in a controlled environment via simulated data samples, benchmark it against common clustering indices and imputation methods, and analyse an empirical data sample to validate practical applicability. The proposed algorithm outperforms all benchmarked cluster segmentation indices. In terms of model selection criteria, we find that the Bayesian information criterion works best for uncensored panel data, whereas sequential hypothesis testing is better suited for censored data. For imputation methods, we find that simple rules outperform the nonparametric k-nearest neighbours algorithm and a multiple imputation method with additive regression and bootstrapping, while also being computationally much faster.