Abstract

KEY POINT: A χ² test is commonly used to analyze categorical data, but valid statistical inferences rely on its test assumptions being met.

In this issue of Anesthesia & Analgesia, Sharkey et al¹ report a randomized trial comparing the incidence of bradycardia after phenylephrine versus norepinephrine given to prevent and treat spinal-induced hypotension in women undergoing cesarean delivery with spinal anesthesia. The authors used a chi-square (χ²) test to compare the groups and observed a lower incidence of bradycardia in the norepinephrine group.

A χ² test commonly either compares the distribution of a categorical variable to a hypothetical distribution or tests whether 2 categorical variables are independent. We focus here on the Pearson χ² test of independence used by Sharkey et al.¹ The Pearson χ² test evaluates the null hypothesis that 2 categorical variables (eg, treatment group [norepinephrine versus phenylephrine] and outcome [bradycardia versus no bradycardia]) are not associated with each other.² In the study by Sharkey et al,¹ a total of 27/112 (24.1%) patients developed bradycardia (Figure). Assuming independence between treatment and bradycardia, the same percentage would be expected in each group (ie, 13.5 of the 56 patients per group). The χ² test then compares the observed to the expected frequencies. The reported P value of .001 indicates that a difference this large or larger would be very unlikely if the null hypothesis were true,³ supporting the conclusion that there is an association between treatment and outcome. While both categorical variables in this example have 2 levels (2 groups, 2 outcomes), a χ² test can more generally be used when variables have multiple categories.

Figure. Contingency table with data from Sharkey et al¹ showing the observed and expected counts (number of patients) with and without bradycardia per treatment group. Assuming that the probability of developing bradycardia is independent of the group allocation (null hypothesis), 13.5 patients with bradycardia would be expected in each group (given a total of 27 patients who developed bradycardia and equal sample size in both groups). The Pearson χ² test compares observed to expected frequencies.

However, valid inferences with a χ² test rely on a number of assumptions,² including:

1. The actual frequencies can be cross-tabulated in a contingency table. It is not appropriate to use a χ² test for percentages or other derived statistics.
2. The 2 variables are nominal, that is, there is no natural ordering of the categories.
3. The observations are independent.
4. The expected count or frequency is ≥5 in more than 75%–80% of the cells in the contingency table, and there is no expected cell count of 0.

With ordinal data, the Pearson χ² test ignores the order of the categories (assumption #2), and the Mantel–Haenszel χ² test provides more power to test for an association.² Repeated measurements in the same subjects are a common violation of assumption #3. Such data require tests that account for the pairing (eg, McNemar test) or longitudinal models that account for the within-subject correlation.⁴ When expected counts are lower than specified in assumption #4, the Fisher exact test can be used.²

Importantly, the χ² test assesses for an association but does not provide information on the strength of the association or on whether the relationship is causal. While a properly conducted randomized controlled trial allows causal inferences, observed associations can be confounded in uncontrolled studies. In such cases, techniques that control for confounding, such as multivariable logistic regression,⁵ are strongly preferred.
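The comparison of observed with expected frequencies described above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not the analysis code of Sharkey et al. The function names are hypothetical, the group size of 56 is derived from the reported total of 112 patients and equal group sizes, and the P value shortcut applies only to a 2 × 2 table (1 degree of freedom).

```python
import math

def expected_from_margins(row_totals, col_totals):
    """Expected counts under independence: row total * column total / N."""
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]

def fraction_small_expected(expected, threshold=5):
    """Share of cells with an expected count below the threshold (assumption #4)."""
    cells = [e for row in expected for e in row]
    return sum(e < threshold for e in cells) / len(cells)

def pearson_chi2(observed):
    """Pearson statistic, the sum of (O - E)^2 / E over all cells, and its df."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    expected = expected_from_margins(row_totals, col_totals)
    stat = sum((o - e) ** 2 / e
               for o_row, e_row in zip(observed, expected)
               for o, e in zip(o_row, e_row))
    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return stat, dof

def p_value_1df(stat):
    """Upper-tail P value of a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(stat / 2))

# Margins taken from the text: 2 groups of 56 patients (assumed from 112 total
# and equal group sizes); 27 patients with and 85 without bradycardia.
expected = expected_from_margins([56, 56], [27, 85])
print(expected)                           # [[13.5, 42.5], [13.5, 42.5]]
print(fraction_small_expected(expected))  # 0.0, so assumption #4 is met
```

To obtain the test statistic itself, the observed counts per group from the Figure would be passed to pearson_chi2 as a list of rows (one row per treatment group), and the resulting statistic to p_value_1df.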
