Parameter-wise co-clustering for high-dimensional data

M P B Gallaugher, C Biernacki, P D McNicholas

doi:10.1007/s00180-022-01289-2

Abstract

In recent years, data dimensionality has increasingly become a concern, leading to many parameter and dimension reduction techniques being proposed in the literature. A parameter-wise co-clustering model, for (possibly high-dimensional) data modelled via continuous random variables, is presented. The proposed model, although allowing more flexibility, still maintains the very high degree of parsimony and interpretability achieved by traditional co-clustering. More precisely, the keystone consists of dramatically increasing the number of column-clusters while expressing each as a combination of a limited number of mean-dependent and variance-dependent column-clusters. A stochastic expectation-maximization algorithm along with a Gibbs sampler is used for parameter estimation and an integrated complete log-likelihood criterion is used for model selection. Simulated and real datasets are used for illustration and comparison with traditional co-clustering.
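
The keystone idea of the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes Gaussian blocks, and all names (`simulate_parameter_wise_blocks`, `count_block_parameters`, `w_mu`, `w_sigma`) are hypothetical. Each column receives one mean-dependent label and one variance-dependent label, and the pair of labels defines its combined column-cluster. The parameter count covers block means and standard deviations only and ignores mixing proportions.

```python
import numpy as np


def simulate_parameter_wise_blocks(n=60, d=12, G=2, L_mu=3, L_sigma=2, seed=0):
    """Simulate data whose columns are partitioned twice (illustrative only).

    Assumption: entry (i, j) is Gaussian with mean mu[z[i], w_mu[j]] and
    standard deviation sigma[z[i], w_sigma[j]].
    """
    rng = np.random.default_rng(seed)
    z = rng.integers(G, size=n)              # row-cluster labels
    w_mu = rng.integers(L_mu, size=d)        # mean-dependent column labels
    w_sigma = rng.integers(L_sigma, size=d)  # variance-dependent column labels

    mu = rng.normal(0.0, 5.0, size=(G, L_mu))         # one mean per (row, mean-column) block
    sigma = rng.uniform(0.5, 2.0, size=(G, L_sigma))  # one sd per (row, variance-column) block

    mean_matrix = mu[z][:, w_mu]        # shape (n, d)
    sd_matrix = sigma[z][:, w_sigma]    # shape (n, d)
    x = rng.normal(mean_matrix, sd_matrix)

    # Each pair of labels defines one combined column-cluster,
    # so there are at most L_mu * L_sigma of them.
    combined = w_mu * L_sigma + w_sigma
    return x, z, w_mu, w_sigma, combined


def count_block_parameters(G, L_mu, L_sigma):
    """Count mean and sd parameters (mixing proportions are ignored)."""
    parameter_wise = G * L_mu + G * L_sigma
    # A traditional co-clustering with the same number of combined
    # column-clusters needs one mean and one sd per block.
    traditional = 2 * G * L_mu * L_sigma
    return parameter_wise, traditional


if __name__ == "__main__":
    x, z, w_mu, w_sigma, combined = simulate_parameter_wise_blocks()
    print(x.shape, len(np.unique(combined)))
    print(count_block_parameters(G=2, L_mu=3, L_sigma=2))
```

With 2 row-clusters, 3 mean-dependent and 2 variance-dependent column-clusters, this sketch allows up to 6 combined column-clusters with 10 block parameters, whereas a traditional co-clustering with 6 column-clusters would need 24.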

Full Text