Abstract

Multivariate analysis of variance (MANOVA) and linear discriminant analysis (LDA) use well-known criteria such as Wilks’ lambda, the Lawley–Hotelling trace, and Pillai’s trace to check the quality of their solutions. This paper suggests using these criteria to build objectives for estimating cluster parameters, because optimizing such objectives corresponds to the best separation between the clusters. The relation to Jöreskog’s classification of factor analysis (FA) techniques is also considered. The problem can be reduced to a multinomial parameterization, and a solution can be found by a nonlinear optimization procedure that yields estimates of the cluster centers and sizes. Because this approach to clustering works with data compressed into a covariance matrix, it can be especially useful for big data.

Highlights

  • Multivariate analysis of variance (MANOVA) is a well-known generalization of the analysis of variance (ANOVA), extended from one to many dependent variables

  • Once the cluster centers and sizes are estimated, the actual clustering, that is, the assignment of each observation to a cluster, can be performed by allocating each observation to the closest cluster

  • Solutions can be found by a nonlinear optimization procedure with the multinomial parameterization, which yields estimates of the cluster centers and sizes
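The last two highlights can be illustrated with a minimal sketch. It assumes that "closest" means the smallest Euclidean distance to a cluster center, and that the multinomial parameterization expresses the cluster shares as a softmax of free parameters so that they stay positive and sum to one. Both are assumptions made for illustration; the function names `multinomial_shares` and `assign_to_nearest_center` are hypothetical and do not come from the paper.

```python
import numpy as np

def multinomial_shares(a):
    """Map unconstrained parameters a_q to cluster shares (assumed softmax form)."""
    e = np.exp(a - np.max(a))  # subtract the max for numerical stability
    return e / e.sum()         # positive shares that sum to one

def assign_to_nearest_center(X, centers):
    """Allocate each row of X to the cluster with the closest center.

    Closeness is measured by squared Euclidean distance (an assumption).
    """
    # Pairwise squared distances, shape (n_observations, n_clusters)
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)

# Toy data: two well-separated groups in two variables
X = np.array([[0.0, 0.1], [0.2, -0.1], [5.0, 5.1], [4.8, 5.3]])
centers = np.array([[0.0, 0.0], [5.0, 5.0]])  # illustrative center estimates

labels = assign_to_nearest_center(X, centers)
shares = multinomial_shares(np.array([0.0, 0.0]))  # equal shares for equal parameters
```

In an optimization routine, the free parameters passed to `multinomial_shares` could be varied without constraints, which is the usual motivation for this kind of reparameterization.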


Summary

Introduction

About the author: Stan Lipovetsky, PhD, senior research director, GfK, Marketing Sciences.

Multivariate analysis of variance (MANOVA) is a well-known generalization of the analysis of variance (ANOVA), extended from one to many dependent variables [...] The between-group matrix is formed as a sum over the clusters q = 1, ..., Q of the outer products of the vectors of distances from the center m_q of each q-th cluster to the total center M, where each vector m_q consists of the means m_qj of all the variables within that cluster, and the vector M contains the total means M_j. For a given matrix S_tot (7), the data clustering corresponds to maximizing the distances between the groups and minimizing them within the groups.
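The decomposition described above, together with the three criteria named in the abstract, can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's procedure: it uses the standard unnormalized MANOVA scatter matrices, weights each outer product by the cluster size n_q, and takes the cluster labels as given rather than estimating them. The function names `scatter_decomposition` and `manova_criteria` are hypothetical.

```python
import numpy as np

def scatter_decomposition(X, labels):
    """Return total, between-group and within-group scatter matrices."""
    M = X.mean(axis=0)                      # total center M
    D = X - M
    S_tot = D.T @ D                         # total scatter
    S_bet = np.zeros_like(S_tot)
    for q in np.unique(labels):
        Xq = X[labels == q]
        d = Xq.mean(axis=0) - M             # distance vector m_q - M
        S_bet += len(Xq) * np.outer(d, d)   # outer product weighted by n_q (assumption)
    S_wit = S_tot - S_bet                   # uses the identity S_tot = S_bet + S_wit
    return S_tot, S_bet, S_wit

def manova_criteria(S_tot, S_bet, S_wit):
    """Wilks' lambda, Lawley-Hotelling trace and Pillai's trace."""
    wilks = np.linalg.det(S_wit) / np.linalg.det(S_tot)
    lawley_hotelling = np.trace(np.linalg.solve(S_wit, S_bet))  # tr(S_wit^-1 S_bet)
    pillai = np.trace(np.linalg.solve(S_tot, S_bet))            # tr(S_tot^-1 S_bet)
    return wilks, lawley_hotelling, pillai

# Synthetic data: two groups of 50 observations in two variables
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 1.0, (50, 2)), rng.normal(4.0, 1.0, (50, 2))])
labels = np.repeat([0, 1], 50)

S_tot, S_bet, S_wit = scatter_decomposition(X, labels)
wilks, lawley_hotelling, pillai = manova_criteria(S_tot, S_bet, S_wit)
```

Better separation between the groups corresponds to a smaller Wilks' lambda and larger values of the two trace criteria, which is why these quantities are candidates for clustering objectives. Note that only S_tot depends on the raw data; the other terms depend on the cluster centers and sizes.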

