Abstract Ecologists and evolutionary biologists are regularly tasked with comparing binary data across groups. There is, however, debate in the biostatistics literature about the best methodology for analysing data in which a binary explanatory variable and a binary response variable form a 2 × 2 contingency table. We assess several methodologies for the analysis of 2 × 2 contingency tables using simulations across a range of sample sizes, with outcomes evenly or unevenly distributed between groups. Specifically, we assess the commonly recommended logistic regression (a generalised linear model [GLM]), the classical Pearson chi‐squared test and four conventional alternatives (Yates' correction, Fisher's exact, exact unconditional and mid‐p), as well as the widely discouraged linear model (LM) regression. We found that both LM and GLM analyses provided unbiased estimates of the difference in proportions between groups. LM and GLM analyses also provided accurate standard errors and confidence intervals when the experimental design was balanced. When the experimental design was unbalanced, sample size was small, and one of the two groups had a probability close to 1 or 0, LM analysis could substantially over‐ or under‐represent statistical uncertainty. For null hypothesis significance testing, the performance of the chi‐squared test and that of the LM analysis were almost identical. Across all scenarios, both had high power to detect non‐null effects and to avoid false positives. By contrast, the GLM analysis was underpowered when using z‐based p‐values, in particular when one of the two groups had a probability near 1 or 0. The GLM using the likelihood ratio test (LRT) had better power to detect non‐null results. Our simulation results suggest that, wherever a chi‐squared test would be recommended, a linear regression is a suitable alternative for the analysis of 2 × 2 contingency table data.
When researchers opt for more sophisticated procedures, we provide R functions to calculate the standard error of a difference between two probabilities from a Bernoulli GLM output using the delta method. We also explore approaches to complement GLM analysis of 2 × 2 contingency tables with credible intervals on the probability scale. These additional operations should help researchers make valid assessments of both statistical and practical significance.
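The delta-method calculation mentioned above can be illustrated with a minimal sketch. It is written in Python rather than R, and it is not the authors' implementation: the function names `delta_se_diff` and `logistic_fit_2x2` are hypothetical. It assumes a Bernoulli GLM with a logit link, an intercept `b0` for the reference group, and a coefficient `b1` for the second group, so that p1 = expit(b0) and p2 = expit(b0 + b1). For a 2 × 2 table with no empty cells, the maximum-likelihood estimates and their covariance matrix have a closed form, which is used here in place of a fitted model object.

```python
import math


def expit(x):
    """Inverse logit: map a log-odds value to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def delta_se_diff(b0, b1, vcov):
    """Delta-method estimate and SE of p2 - p1 on the probability scale.

    b0, b1 : logit-scale intercept and group coefficient (assumed coding).
    vcov   : 2 x 2 covariance matrix of (b0, b1) as nested lists.
    """
    p1 = expit(b0)
    p2 = expit(b0 + b1)
    # Gradient of g(b0, b1) = expit(b0 + b1) - expit(b0)
    g0 = p2 * (1.0 - p2) - p1 * (1.0 - p1)
    g1 = p2 * (1.0 - p2)
    # Quadratic form grad' V grad
    var = (g0 * g0 * vcov[0][0]
           + 2.0 * g0 * g1 * vcov[0][1]
           + g1 * g1 * vcov[1][1])
    return p2 - p1, math.sqrt(var)


def logistic_fit_2x2(s1, f1, s2, f2):
    """Closed-form logistic MLE for a 2 x 2 table (all cells must be > 0).

    s1, f1 : successes and failures in the reference group.
    s2, f2 : successes and failures in the second group.
    """
    b0 = math.log(s1 / f1)
    b1 = math.log(s2 / f2) - b0
    v1 = 1.0 / s1 + 1.0 / f1   # variance of the group-1 log-odds
    v2 = 1.0 / s2 + 1.0 / f2   # variance of the group-2 log-odds
    vcov = [[v1, -v1], [-v1, v1 + v2]]
    return b0, b1, vcov


# Illustrative counts only (not data from the study)
b0, b1, vcov = logistic_fit_2x2(12, 8, 5, 15)
diff, se = delta_se_diff(b0, b1, vcov)
print(diff, se)
```

In this saturated two-group case the delta-method variance reduces algebraically to p1(1 − p1)/n1 + p2(1 − p2)/n2, the familiar Wald variance of a difference in proportions, which gives a simple check on the calculation.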