Abstract

When data are not normally distributed, researchers are often uncertain whether it is legitimate to use tests that assume Gaussian errors, or whether one has to either model a more specific error structure or use randomization techniques. Here we use Monte Carlo simulations to explore the pros and cons of fitting Gaussian models to non-normal data in terms of risk of type I error, power and utility for parameter estimation. We find that Gaussian models are robust to non-normality over a wide range of conditions, meaning that p values remain fairly reliable except for data with influential outliers judged at strict alpha levels. Gaussian models also performed well in terms of power across all simulated scenarios. Parameter estimates were mostly unbiased and precise except if sample sizes were small or the distribution of the predictor was highly skewed. Transformation of data before analysis is often advisable and visual inspection for outliers and heteroscedasticity is important for assessment. In strong contrast, some non-Gaussian models and randomization techniques bear a range of risks that are often insufficiently known. High rates of false-positive conclusions can arise for instance when overdispersion in count data is not controlled appropriately or when randomization procedures ignore existing non-independencies in the data. Hence, newly developed statistical methods not only bring new opportunities, but they can also pose new threats to reliability. We argue that violating the normality assumption bears risks that are limited and manageable, while several more sophisticated approaches are relatively error prone and particularly difficult to check during peer review. Scientists and reviewers who are not fully aware of the risks might benefit from preferentially trusting Gaussian mixed models in which random effects account for non-independencies in the data.
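As an illustration of the kind of simulation described above, the following minimal sketch (not the authors' code; the log-normal response, sample size, number of replicates, and alpha level are arbitrary illustrative choices) estimates the type I error rate of an ordinary Gaussian regression fitted to a strongly non-normal response that is unrelated to the predictor. If the Gaussian model is robust to this violation, the observed rate should stay close to the nominal alpha.

    # Minimal sketch: type I error rate of a Gaussian linear model
    # when the response is log-normal and independent of the predictor,
    # so every "significant" slope is a false positive.
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(1)
    n, n_sim, alpha = 50, 10_000, 0.05

    false_positives = 0
    for _ in range(n_sim):
        x = rng.normal(size=n)           # predictor, unrelated to the response
        y = rng.lognormal(size=n)        # strongly skewed, non-normal response
        fit = stats.linregress(x, y)     # ordinary Gaussian linear model (OLS)
        if fit.pvalue < alpha:
            false_positives += 1

    print(f"observed type I error rate: {false_positives / n_sim:.3f} "
          f"(nominal alpha = {alpha})")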

Highlights

  • In the biological, medical, and social sciences, the validity or importance of research findings is generally assessed via statistical significance tests

  • We argue that pseudoreplication is a well-known problem that has been solved reasonably well within the framework of mixed models, and the consideration or neglect of essential random effects can be readily judged from tables that present the model output (see the mixed-model sketch after this list)

  • But consistent with a considerable body of literature (Ali & Sharma, 1996; Box & Watson, 1962; Gelman & Hill, 2007; Lumley et al., 2002; Miller, 1986; Puth et al., 2014; Ramsey & Schafer, 2013; Schielzeth et al., 2020; Warton et al., 2016; Williams et al., 2013; Zuur et al., 2010), we find that violations of the normality-of-residuals assumption are rarely problematic for hypothesis testing and parameter estimation, and we argue that the commonly recommended solutions may bear greater risks than the problem they are meant to solve
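As a concrete illustration of the mixed-model point above, here is a minimal sketch (hypothetical data and variable names, not taken from the paper) in which a Gaussian mixed model with a random intercept per individual accounts for the non-independence created by repeated measurements of the same individuals.

    # Hypothetical example: repeated measurements per individual create
    # pseudoreplication; a random intercept for 'individual' absorbs it.
    import numpy as np
    import pandas as pd
    import statsmodels.formula.api as smf

    rng = np.random.default_rng(2)
    n_ind, n_rep = 30, 5                                  # 30 individuals, 5 repeats each
    individual = np.repeat(np.arange(n_ind), n_rep)
    x = rng.normal(size=n_ind * n_rep)
    ind_effect = rng.normal(scale=1.0, size=n_ind)[individual]  # shared within individual
    y = 0.5 * x + ind_effect + rng.normal(size=n_ind * n_rep)

    data = pd.DataFrame({"y": y, "x": x, "individual": individual})

    # Gaussian mixed model: fixed effect of x, random intercept per individual.
    fit = smf.mixedlm("y ~ x", data, groups=data["individual"]).fit()
    print(fit.summary())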

Introduction

In the biological, medical, and social sciences, the validity or importance of research findings is generally assessed via statistical significance tests. In our Monte Carlo simulations, we used the simulated data both as the dependent variable Y and as the predictor variable X in linear regression models, yielding 10 × 10 = 100 combinations of Y and X for each sample size (see Fig. S1 for the distributions of the dependent variable Y, the predictor X, and the residuals).
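A minimal sketch of that crossing design is given below; the distributions named here are placeholders, since the paper's actual set of ten distributions is described in the full text and Fig. S1.

    # Sketch of the crossed simulation design: every response distribution is
    # paired with every predictor distribution and a Gaussian model is fitted.
    from itertools import product
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(3)
    n = 100  # one of several sample sizes

    distributions = {
        "normal":    lambda: rng.normal(size=n),
        "lognormal": lambda: rng.lognormal(size=n),
        "poisson":   lambda: rng.poisson(lam=2.0, size=n).astype(float),
        # ... further placeholder distributions would extend this to the 10 x 10 grid
    }

    for (name_y, draw_y), (name_x, draw_x) in product(distributions.items(), repeat=2):
        y, x = draw_y(), draw_x()
        fit = stats.linregress(x, y)
        print(f"Y: {name_y:9s}  X: {name_x:9s}  slope p = {fit.pvalue:.3f}")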
