Abstract

Many biological variables are distributed geometrically (proportionally) rather than arithmetically, and they are lognormal rather than normal on the usual arithmetic scale of measurement. The distinction is important because it affects statistical interpretation at many levels: for example, logarithmic transformations commonly used in biology to standardize variances and linearize relationships among arithmetic measurements will skew underlying distributions if these are inherently arithmetic-normal but not if they are geometric-normal. The purpose of this study is to determine theoretically, using Monte Carlo simulation, which of a range of recommended tests of normality has the greatest power, and what sample size is required to distinguish geometric-normal from arithmetic-normal distributions when inherent variability is low, as it is in most biological distributions. Lilliefors’ version of the Kolmogorov-Smirnov test, Frosini’s test, and the Anderson-Darling test are three non-parametric goodness-of-fit tests of normality based on an observed empirical distribution function. When inherent variability V is on the order of 0.10 (standard deviation 10% of the mean), Lilliefors’ test requires a minimum sample of about 2200 to correctly distinguish lognormal distributions from normal 95% of the time (with the level of significance, or type I error rate, α and the type II error rate β both 0.05). In the same situation, Frosini’s test requires a minimum sample of about 1700; the Anderson-Darling test is more powerful, but still requires a minimum sample of about 1500. Power is sensitive to inherent variability: when V = 0.05 the Anderson-Darling test requires a minimum sample much greater than 2500, but when V = 0.15 it requires a minimum sample of only about 650.
Sensitivity of the power of all tests to inherent variability means that the normality of body measurements like weight, with V typically ≈ 0.15, is more easily tested than the normality of body lengths, with V typically ≈ 0.05, in the same sample. Inherent variability must be considered in designing empirical tests of normality, and differences in inherent variability must be considered in interpreting results.
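
The kind of Monte Carlo power estimate described above can be sketched as follows. This is an illustration under stated assumptions, not the study's own procedure: the function names, the seed, and the replicate counts are hypothetical, and the rejection rule uses Stephens' small-sample adjustment of the Anderson-Darling statistic with a 5% critical value of 0.752 (mean and variance estimated from the sample), which may differ from the critical values used in the study. A lognormal with coefficient of variation V has log-scale standard deviation sqrt(ln(1 + V²)).

```python
import math
import random


def lognormal_sigma(v):
    """Log-scale SD of a lognormal whose arithmetic coefficient of variation is v."""
    return math.sqrt(math.log(1.0 + v * v))


def anderson_darling_rejects(sample, critical=0.752):
    """Return True if normality is rejected at roughly the 5% level.

    Assumption: Stephens' adjusted statistic A2 * (1 + 0.75/n + 2.25/n^2)
    compared with 0.752, for mean and variance estimated from the sample.
    """
    n = len(sample)
    mean = math.fsum(sample) / n
    sd = math.sqrt(math.fsum((x - mean) ** 2 for x in sample) / (n - 1))
    # Fitted normal CDF evaluated at the sorted observations.
    cdf = [0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))
           for x in sorted(sample)]
    # Clamp away from 0 and 1 so the logarithms below stay finite.
    cdf = [min(max(p, 1e-12), 1.0 - 1e-12) for p in cdf]
    s = math.fsum((2 * i + 1) * (math.log(cdf[i]) + math.log(1.0 - cdf[n - 1 - i]))
                  for i in range(n))
    a2 = -n - s / n
    return a2 * (1.0 + 0.75 / n + 2.25 / n ** 2) > critical


def estimate_power(n, v, reps, seed=0, lognormal=True):
    """Fraction of simulated samples of size n in which normality is rejected.

    With lognormal=True this estimates power against a lognormal with CV v;
    with lognormal=False it estimates the type I error rate under a normal
    distribution with mean 1 and standard deviation v.
    """
    rng = random.Random(seed)
    sigma = lognormal_sigma(v)
    rejections = 0
    for _ in range(reps):
        if lognormal:
            sample = [rng.lognormvariate(0.0, sigma) for _ in range(n)]
        else:
            sample = [rng.gauss(1.0, v) for _ in range(n)]
        if anderson_darling_rejects(sample):
            rejections += 1
    return rejections / reps
```

For example, `estimate_power(650, 0.15, 500)` gives a rough power estimate for the V = 0.15 case. The result depends on the seed, the number of replicates, and the critical-value convention, so it should not be expected to reproduce the sample sizes quoted above exactly.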
