Abstract

Finite mixture models are increasingly used to model the distributions of a wide variety of random phenomena and to cluster data sets; see, for example, (2000a). We consider the use of normal mixture models to cluster continuous multivariate data, concentrating on some of the associated computational issues. A robust version of this approach to clustering is obtained by modelling the data with a mixture of t distributions (Peel and McLachlan, 2000). The normal and t mixture models can be fitted by maximum likelihood via the EM algorithm, as implemented in the EMMIX software of the authors. We report some recent results of (2000) on speeding up the fitting process with an incremental version of the EM algorithm. We also consider the problem of clustering high-dimensional data using the mixture of factor analyzers model (McLachlan and Peel, 2000b). This approach enables a normal mixture model to be fitted to data whose dimension is high relative to the number of data points to be clustered.

Keywords (added by machine, not by the authors): Human Mammary Epithelial Cell; Factor Analyzer Model; Finite Mixture Model; Normal Mixture Model; Probabilistic Principal Component Analyser
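As an illustration of the maximum likelihood fitting of a normal mixture by EM mentioned in the abstract, the following is a minimal sketch in Python with NumPy. It is not the EMMIX implementation: the function name `fit_normal_mixture`, its parameters, the spherical starting covariances, and the small ridge added to each covariance matrix are all assumptions made for this example.

```python
import numpy as np

def fit_normal_mixture(X, g, n_iter=100, init_means=None, seed=0, ridge=1e-6):
    """Fit a g-component normal mixture to X (n x p) by plain EM.

    Hypothetical helper for illustration only; not the EMMIX API.
    """
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    rng = np.random.default_rng(seed)

    # Starting values (assumptions): g data points as means unless supplied,
    # a common spherical covariance, and equal mixing proportions.
    if init_means is None:
        means = X[rng.choice(n, size=g, replace=False)].copy()
    else:
        means = np.array(init_means, dtype=float)
    covs = np.stack([np.eye(p) * X.var(axis=0).mean() for _ in range(g)])
    props = np.full(g, 1.0 / g)

    for _ in range(n_iter):
        # E-step: log(pi_i * phi(x_j; mu_i, Sigma_i)) for every point and component.
        log_w = np.empty((n, g))
        for i in range(g):
            diff = X - means[i]
            chol = np.linalg.cholesky(covs[i])
            z = np.linalg.solve(chol, diff.T)           # shape (p, n)
            maha = np.sum(z ** 2, axis=0)               # squared Mahalanobis distances
            log_det = 2.0 * np.sum(np.log(np.diag(chol)))
            log_w[:, i] = np.log(props[i]) - 0.5 * (p * np.log(2.0 * np.pi) + log_det + maha)
        log_norm = np.logaddexp.reduce(log_w, axis=1)   # log mixture density per point
        tau = np.exp(log_w - log_norm[:, None])         # posterior membership probabilities

        # M-step: weighted proportions, means, and covariance matrices.
        nk = tau.sum(axis=0)
        props = nk / n
        means = (tau.T @ X) / nk[:, None]
        for i in range(g):
            diff = X - means[i]
            covs[i] = (tau[:, i, None] * diff).T @ diff / nk[i] + ridge * np.eye(p)

    # The log-likelihood returned is the one computed in the final E-step,
    # i.e. at the parameter values held before the last M-step.
    return props, means, covs, tau, log_norm.sum()
```

A clustering of the data can then be read off by assigning each point to the component with the largest posterior probability, for example `labels = tau.argmax(axis=1)`. The sketch does not guard against a component losing all of its weight, and like any EM fit it can converge to a local maximum, so several starting values would normally be tried.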
