Abstract
Student’s t-test and classical F-test ANOVA rely on the assumptions that two or more samples are independent, and that independent and identically distributed residuals are normal and have equal variances between groups. We focus on the assumptions of normality and equality of variances, and argue that these assumptions are often unrealistic in the field of psychology. We underline the current lack of attention to these assumptions through an analysis of researchers’ practices. Through Monte Carlo simulations, we illustrate the consequences of performing the classic parametric F-test for ANOVA when the test assumptions are not met on the Type I error rate and statistical power. Under realistic deviations from the assumption of equal variances, the classic F-test can yield severely biased results and lead to invalid statistical inferences. We examine two common alternatives to the F-test, namely the Welch’s ANOVA (W-test) and the Brown-Forsythe test (F*-test). Our simulations show that under a range of realistic scenarios, the W-test is a better alternative and we therefore recommend using the W-test by default when comparing means. We provide a detailed example explaining how to perform the W-test in SPSS and R. We summarize our conclusions in practical recommendations that researchers can use to improve their statistical practices.
Highlights
Student’s t-test and classical F-test Analysis of Variance (ANOVA) rely on the assumptions that two or more samples are independent, and that independent and identically distributed residuals are normal and have equal variances between groups
Despite the fact that the F-test is currently used by default, better alternatives exist, such as the Welch’s W ANOVA (W-test), the Alexander-Govern test, James’ second order test, and the Brown-Forsythe ANOVA (F*-test)
We argue that when variances are equal between groups, the W-test has nearly the same empirical Type I error rate and power as the F-test, but when variances are unequal, it provides empirical Type I and Type II error rates that are closer to the expected levels compared to the F-test
Summary
Student’s t-test and classical F-test ANOVA rely on the assumptions that two or more samples are independent, and that independent and identically distributed residuals are normal and have equal variances between groups. When there are more than two groups, the F-test becomes more liberal, meaning that the Type I error rate is larger than the nominal alpha level, even when sample sizes are equal across groups (Tomarken & Serlin, 1986).
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have
Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.