Block clustering via the block GEM and two-way EM algorithms

M Nadif, G Govaert

doi:10.1109/aiccsa.2005.1387029

Abstract

Summary form only given. Cluster analysis is an important tool in a variety of scientific areas such as pattern recognition, information retrieval, microarray analysis, and data mining. Many clustering procedures, such as hierarchical clustering, k-means, or self-organizing maps, aim to construct an optimal partition of the set of objects I or, sometimes, of the set of variables J. Other methods, called block clustering methods, consider the two sets simultaneously and organize the data into homogeneous blocks. These methods are fast, can process large data sets, and require far fewer computations than working on I and J separately. The mixture model is undoubtedly one of the greatest contributions to clustering. We recently proposed a generalized EM algorithm (GEM) to maximize a variational approximation of the likelihood. It is an iterative algorithm whose steps are carried out by applying the EM algorithm to intermediate mixture models. This paper focuses on the clustering context and compares block GEM with two-way EM, i.e., EM applied separately to I and J. Results on simulated data confirm that block GEM performs much better than two-way EM.
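To make the two-way EM baseline concrete, the following is a minimal sketch, not the authors' implementation. It assumes binary data and a Bernoulli mixture, neither of which is stated in the abstract, and the function names `bernoulli_mixture_em` and `two_way_em` are hypothetical. EM is run once on the rows (objects I) and once, independently, on the columns (variables J). The block GEM algorithm, which couples the two partitions, is not shown.

```python
import numpy as np


def bernoulli_mixture_em(X, k, n_iter=50, seed=0):
    """Fit a k-component Bernoulli mixture to the rows of a binary matrix X.

    Assumption: binary data with a Bernoulli mixture (not given in the abstract).
    Returns hard labels obtained from the posterior responsibilities.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    pi = np.full(k, 1.0 / k)                      # mixing proportions
    theta = rng.uniform(0.25, 0.75, size=(k, d))  # Bernoulli parameters
    for _ in range(n_iter):
        # E-step: posterior responsibilities, computed in log space
        log_p = X @ np.log(theta).T + (1 - X) @ np.log(1 - theta).T + np.log(pi)
        log_p -= log_p.max(axis=1, keepdims=True)
        resp = np.exp(log_p)
        resp /= resp.sum(axis=1, keepdims=True)
        # M-step: update proportions and parameters (clipped away from 0 and 1)
        nk = resp.sum(axis=0) + 1e-10
        pi = nk / nk.sum()
        theta = np.clip((resp.T @ X) / nk[:, None], 1e-6, 1 - 1e-6)
    return resp.argmax(axis=1)


def two_way_em(X, k_rows, k_cols, seed=0):
    """Two-way EM: cluster rows (I) and columns (J) separately."""
    row_labels = bernoulli_mixture_em(X, k_rows, seed=seed)
    col_labels = bernoulli_mixture_em(X.T, k_cols, seed=seed)  # columns as objects
    return row_labels, col_labels
```

Because the two runs never exchange information, the row partition cannot benefit from the column partition and vice versa, which is the limitation that a block method addresses by treating I and J simultaneously.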

Full Text