Abstract
Multivariate analysis of variance (MANOVA) and linear discriminant analysis (LDA) use well-known criteria such as Wilks' lambda, the Lawley–Hotelling trace, and Pillai's trace to check the quality of their solutions. This paper suggests using these criteria to build objectives for estimating cluster parameters, because optimizing such objectives corresponds to the best discrimination between the clusters. The relation to Joreskog's classification of factor analysis (FA) techniques is also considered. The problem can be reduced to a multinomial parameterization, and the solution can be found by a nonlinear optimization procedure that yields estimates of the cluster centers and sizes. Because this approach to clustering works with data compressed into a covariance matrix, it can be especially useful for big data.
About the author: Stan Lipovetsky, PhD, senior research director, GfK, Marketing Sciences.
Highlights
Multivariate analysis of variance (MANOVA) is a well-known generalization of the analysis of variance (ANOVA), extended from one to many dependent variables.
Once the cluster centers and sizes are estimated, the actual clustering, that is, the assignment of each observation to one cluster or another, can be performed by allocating each observation to the closest cluster.
Solutions can be found by a nonlinear optimization procedure with a multinomial parameterization, which yields estimates of the cluster centers and sizes.
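The allocation step mentioned in the highlights can be illustrated with a minimal sketch. It assumes that "closest" means the smallest Euclidean distance to a cluster center, which this text does not state. The function name `assign_to_closest_center` and the toy data are hypothetical, and the centers are taken as already estimated.

```python
import numpy as np

def assign_to_closest_center(X, centers):
    """Return, for each row of X, the index of the nearest cluster center.

    Assumption: 'closest' is measured by Euclidean distance.
    """
    X = np.asarray(X, dtype=float)
    centers = np.asarray(centers, dtype=float)
    # Pairwise differences, shape (n_observations, n_clusters, n_variables).
    diffs = X[:, None, :] - centers[None, :, :]
    # Squared Euclidean distances, shape (n_observations, n_clusters).
    sq_dist = np.sum(diffs ** 2, axis=2)
    return np.argmin(sq_dist, axis=1)

# Toy data: four observations and two centers assumed to be already estimated.
X = np.array([[0.0, 0.1], [0.2, -0.1], [4.9, 5.2], [5.1, 4.8]])
centers = np.array([[0.0, 0.0], [5.0, 5.0]])
labels = assign_to_closest_center(X, centers)
```

Here the first two observations fall in cluster 0 and the last two in cluster 1. Only the centers are needed at this stage, so the assignment can be run once over the raw data after the parameters have been estimated.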
Summary
Multivariate analysis of variance (MANOVA) is a well-known generalization of the analysis of variance (ANOVA), extended from one to many dependent variables. A sum over the clusters q = 1, ..., Q is taken of the outer products of the vectors of distances from the center m_q of each q-th cluster to the total center M, where each vector m_q consists of the means m_qj of all the variables and the vector M contains the total means M_j. For a given matrix S_tot (7), data clustering corresponds to maximizing the distances between the groups and minimizing them within the groups.
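The between-group and within-group quantities described above, together with the three criteria named in the abstract, can be sketched numerically. This is an illustration under stated assumptions, not the paper's procedure: it assumes the standard decomposition S_tot = S_within + S_between, with each outer product weighted by the cluster size N_q (the weighting is not stated in this text), and it uses the textbook definitions of Wilks' lambda, the Lawley–Hotelling trace, and Pillai's trace. The function names and toy data are hypothetical.

```python
import numpy as np

def scatter_matrices(X, labels):
    """Total, within-group, and between-group scatter matrices."""
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    M = X.mean(axis=0)                      # total center M
    p = X.shape[1]
    S_within = np.zeros((p, p))
    S_between = np.zeros((p, p))
    for q in np.unique(labels):
        Xq = X[labels == q]
        m_q = Xq.mean(axis=0)               # center of the q-th cluster
        centered = Xq - m_q
        S_within += centered.T @ centered
        d = (m_q - M)[:, None]              # distance vector from m_q to M
        # Assumption: the outer product is weighted by the cluster size N_q.
        S_between += len(Xq) * (d @ d.T)
    S_tot = (X - M).T @ (X - M)
    return S_tot, S_within, S_between

def manova_criteria(S_within, S_between):
    """Wilks' lambda, Lawley-Hotelling trace, and Pillai's trace."""
    S_tot = S_within + S_between
    wilks = np.linalg.det(S_within) / np.linalg.det(S_tot)
    lawley_hotelling = np.trace(np.linalg.solve(S_within, S_between))
    pillai = np.trace(np.linalg.solve(S_tot, S_between))
    return wilks, lawley_hotelling, pillai

# Toy data: two well-separated clusters of three points each.
X = np.array([[0, 0], [1, 0], [0, 1], [5, 5], [6, 5], [5, 6]], dtype=float)
labels = np.array([0, 0, 0, 1, 1, 1])
S_tot, S_within, S_between = scatter_matrices(X, labels)
wilks, lawley_hotelling, pillai = manova_criteria(S_within, S_between)
```

Better separated clusters give a smaller Wilks' lambda and larger Lawley–Hotelling and Pillai traces, which is why such criteria can serve as objectives. All three depend on the data only through the scatter matrices, consistent with the abstract's remark about working with data compressed into a covariance matrix.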