Abstract
Investigating differences between the means of more than two groups or experimental conditions is a routine research question in biology. To assess such differences statistically, multiple comparison procedures are applied. The most prominent procedures of this type, the Dunnett and Tukey-Kramer tests, control the probability of reporting at least one false positive result when the data are normally distributed and when the sample sizes and variances do not differ between groups. All three assumptions are unrealistic in biological research, and any violation can lead to an increased number of reported false positive results. Based on a general statistical framework for simultaneous inference and robust covariance estimators, we propose a new multiple comparison procedure for assessing multiple means. In contrast to the Dunnett or Tukey-Kramer tests, no assumptions regarding the distribution, sample sizes or variance homogeneity are necessary. The performance of the new procedure is assessed by means of its familywise error rate and power under different distributions. The practical merits are demonstrated by a reanalysis of fatty acid phenotypes of the bacterium Bacillus simplex from the “Evolution Canyons” I and II in Israel. The simulation results show that even under severely varying variances, the procedure controls the number of false positive findings very well. Thus, the procedure presented here works well under biologically realistic scenarios of unbalanced group sizes, non-normality and heteroscedasticity.
Highlights
Many research projects in Life Sciences employ comparative studies [1,2,3,4,5]
With unequal variances and higher variances in the larger groups for both normal or nonnormal data, the Tukey-Kramer test is conservative while the estimated familywise error rate of the max-t test using the heteroscedastic consistent covariance estimation is close to α = 0.05 already for a total sample size of N = 60
The familywise error rate of the max-t test using the consistent covariance estimation is liberal for a total sample size of N = 60 but close to α = 0.05 with increasing total sample size N
Summary
Many research projects in Life Sciences employ comparative studies [1,2,3,4,5]. For example, biodiversity exploration such as in population genetics measures the properties of individuals belonging to different groups. For many users with less statistical training, it is hard to verify to what extent statistical procedures for comparing means rely on theoretical assumptions such as normality or homoscedasticity, i.e. homogeneous or equal variances among all groups. This may lead to misapplication of tests, which is often not even detected by reviewers or editors. All previously suggested parametric procedures for comparisons of means, such as the methods by Tukey [6] and Dunnett [7], require homogeneous variances among all groups. No methods for multiple pairwise comparisons of means in the presence of heteroscedasticity and potentially unequal sample sizes in the groups exist so far.
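To make the idea of a max-t test with a heteroscedasticity-consistent covariance estimate more concrete, the following is a minimal sketch for all pairwise comparisons in a one-way layout. It is not the authors' implementation. The function names `hc3_group_variances` and `max_t_all_pairs` are hypothetical, the choice of an HC3-type estimator is an assumption, and the null distribution of the maximum statistic is approximated by Monte Carlo simulation from a normal distribution rather than by a multivariate t distribution.

```python
import math
import random
from itertools import combinations


def hc3_group_variances(groups):
    """Return (group means, estimated variances of those means).

    In a one-way cell-means model the sandwich covariance estimate is
    diagonal, and the HC3 variant (an assumption of this sketch) reduces
    to sum(residual^2) / (n - 1)^2 within each group.
    """
    means, variances = [], []
    for values in groups:
        n = len(values)
        if n < 2:
            raise ValueError("each group needs at least two observations")
        mean = sum(values) / n
        ss = sum((v - mean) ** 2 for v in values)
        means.append(mean)
        variances.append(ss / (n - 1) ** 2)
    return means, variances


def max_t_all_pairs(groups, n_sim=20000, seed=0):
    """All-pairs max-t test; returns {(i, j): (t_statistic, adjusted_p)}."""
    means, variances = hc3_group_variances(groups)
    pairs = list(combinations(range(len(groups)), 2))
    se = {(i, j): math.sqrt(variances[i] + variances[j]) for i, j in pairs}
    if any(s == 0.0 for s in se.values()):
        raise ValueError("zero standard error: groups have no variation")
    t_obs = {(i, j): (means[i] - means[j]) / se[(i, j)] for i, j in pairs}

    # Simulate the null distribution of max |t| over all pairs, drawing
    # independent normal errors for the group means (normal approximation).
    rng = random.Random(seed)
    sds = [math.sqrt(v) for v in variances]
    max_abs = []
    for _ in range(n_sim):
        z = [rng.gauss(0.0, sd) for sd in sds]
        max_abs.append(max(abs(z[i] - z[j]) / se[(i, j)] for i, j in pairs))

    # Adjusted p-value: share of simulated maxima at least as extreme
    # as the observed |t| of each comparison.
    result = {}
    for pair, t in t_obs.items():
        exceed = sum(1 for m in max_abs if m >= abs(t))
        result[pair] = (t, exceed / n_sim)
    return result
```

Each group gets its own variance estimate, so unequal variances and unequal group sizes enter the standard error of every comparison directly. Comparing every statistic against the simulated distribution of the maximum is what ties the adjusted p-values to the familywise error rate. For small samples the normal reference used here is likely to be somewhat liberal.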