Abstract

KEY POINT: Poisson regression and its extensions are used to estimate the relationship between 1 or more independent (predictor) variables and a count outcome variable.

In this issue of Anesthesia & Analgesia, Nafiu et al¹ report the results of a retrospective observational study on the association of preoperative pneumonia with postsurgical morbidity and mortality in children undergoing inpatient surgery. While their primary outcome (time to mortality) was assessed with Cox proportional hazards regression, discussed in a previous Statistical Minute,² these authors analyzed length of postoperative mechanical ventilation with 0-inflated Poisson regression.

In medical research, investigators are often interested in countable quantities such as days of mechanical ventilation, hospital length of stay, or the number of postoperative infections. The values of such variables are always nonnegative whole numbers, and the distribution is often markedly skewed, with many values being relatively low and few being high. For example, most patients are ventilated for 0–2 days postoperatively and only a few for >2 days. Such data violate the assumptions of statistical methods traditionally used for continuous variables, such as linear regression³ and related techniques (eg, t tests).⁴ Poisson regression is the prototypical approach to the analysis of count data, and several extensions of Poisson regression are available, including the 0-inflated Poisson regression applied by Nafiu et al.¹

Poisson regression is a generalization of linear regression. As with other regression techniques,³ the Poisson model can accommodate ≥1 independent (predictor) variables. It can be used to describe the independent relationship of each variable with the count outcome while holding constant the values of other variables (eg, to control for confounding in observational research⁵); to test hypotheses about relationships; and to predict the outcome based on a set of predictor variables.
However, rather than modeling a linear relationship with the expected value (mean) of the outcome as in linear regression, Poisson regression assumes a linear relationship between the independent variable(s) and the natural logarithm (ln) of the expected value of the outcome (E[y]). In the simplest case with only 1 independent (predictor) variable x, the regression coefficients represent the intercept (b0) and slope (b1) of this line:

ln(E[y]) = b0 + b1 × x

At first glance, this looks very similar to the often-used approach of log-transforming the outcome variable in an attempt to approximate a normal (Gaussian) distribution and then using the transformed outcome variable in a linear regression model. The difference, however, is that in Poisson regression, the logarithm is applied to the parameter of interest (the expected value of the outcome), not to the actual outcome values. The exponentiated slope regression coefficients in Poisson regression and the similar count models discussed below can conveniently be interpreted as an estimate of a ratio of arithmetic means.

Count data are often an enumeration of events occurring per unit of time, area, or population (eg, intraoperative cardiac arrests per 100,000 operations). Thus, depending on the context, the exponentiated regression coefficient can also represent the ratio of 2 event rates or incidence rates, and it is hence commonly referred to as the incidence rate ratio (IRR). The IRR describes the estimated multiplicative change in the rate (or in the mean count, when analyzing simple counts instead of rates) for each 1-unit increase in a continuous independent variable, or versus a reference category for a categorical independent variable. For example, the IRR estimate of 1.47 reported by Nafiu et al¹ indicates that patients with preoperative pneumonia were estimated to have a duration of ventilation that was on average 1.47 times as long as (ie, 47% longer than) that of patients without preoperative pneumonia.
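The interpretation of the exponentiated slope as a ratio of arithmetic means can be checked numerically. The following is a minimal sketch, not the analysis performed by Nafiu et al: the counts are made up for illustration, the function name fit_poisson is hypothetical, and the model is fitted with a plain Newton-Raphson iteration on the Poisson log-likelihood instead of a statistical package.

```python
import math

def fit_poisson(x, y, iterations=25):
    """Fit ln(E[y]) = b0 + b1*x by Newton-Raphson on the Poisson log-likelihood."""
    # Start from the intercept-only solution: b0 = ln(mean of y), b1 = 0.
    b0, b1 = math.log(sum(y) / len(y)), 0.0
    for _ in range(iterations):
        mu = [math.exp(b0 + b1 * xi) for xi in x]  # fitted means
        # Gradient of the log-likelihood with respect to b0 and b1.
        g0 = sum(yi - mi for yi, mi in zip(y, mu))
        g1 = sum(xi * (yi - mi) for xi, yi, mi in zip(x, y, mu))
        # Information matrix (negative Hessian), a symmetric 2x2 matrix.
        h00 = sum(mu)
        h01 = sum(xi * mi for xi, mi in zip(x, mu))
        h11 = sum(xi * xi * mi for xi, mi in zip(x, mu))
        det = h00 * h11 - h01 * h01
        # Newton step: add (information matrix)^-1 times the gradient.
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1

# Hypothetical counts (eg, days of ventilation); NOT data from any study.
unexposed = [0, 0, 1, 1, 2, 2, 0, 1]   # x = 0, mean 0.875
exposed = [1, 2, 3, 0, 2, 4, 1, 1]     # x = 1, mean 1.75

x = [0] * len(unexposed) + [1] * len(exposed)
y = unexposed + exposed

b0, b1 = fit_poisson(x, y)
irr = math.exp(b1)  # exponentiated slope
ratio_of_means = (sum(exposed) / len(exposed)) / (sum(unexposed) / len(unexposed))

print(f"exp(b0) = {math.exp(b0):.3f}  (mean count in the reference group)")
print(f"IRR = exp(b1) = {irr:.3f}")
print(f"ratio of group means = {ratio_of_means:.3f}")
```

With a single binary predictor, the fitted IRR coincides with the ratio of the 2 group means, and exp(b0) is the mean count in the reference group. With continuous or multiple predictors, the IRR is instead the estimated multiplicative change per 1-unit increase, holding the other variables constant.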
Poisson regression assumes that the conditional distribution of the outcome is a Poisson distribution (Figure). In this distribution, the mean value and the variance are equal. However, in real-world count data, the conditional variance often exceeds the conditional mean, which is referred to as “overdispersion.” Negative binomial regression is similar to Poisson regression but allows for overdispersion. In fact, Poisson regression is a special case of negative binomial regression, and both give the same results in the absence of overdispersion. However, negative binomial regression is more appropriate in the presence of overdispersion, and therefore we generally recommend using negative binomial regression instead of Poisson regression as the default method when analyzing count data.

Figure. Examples of probability mass functions of the Poisson distribution. Note that the Poisson distribution only has 1 parameter, λ, which is the mean value as well as the variance. As shown here, the spread of the values around the mean increases with increasing λ. Poisson regression assumes that the outcome, conditional on the value(s) of the independent variable(s), follows a Poisson distribution, and models a linear relationship between the independent variable(s) and the natural logarithm of the mean value.

Moreover, the number of observed 0 counts in real-world count data often markedly deviates from the number expected based on the Poisson distribution. For example, in the study by Nafiu et al,¹ many patients apparently did not require any postoperative ventilation, leading to an excessive number of 0s. Conversely, 0 counts can sometimes be impossible, for example, when studying hospital length of stay in hospitalized patients. For such scenarios, 0-inflated (excess 0s) and 0-truncated (no 0s possible) variants of both Poisson regression and negative binomial regression are available.
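The equality of mean and variance, and the notion of "more 0s than expected," can be illustrated with a short calculation. This is a rough sketch under stated assumptions: the counts are invented for illustration, and the comparison uses the overall (marginal) mean and variance, whereas the model assumptions concern the distribution conditional on the predictors. It is a crude screening check, not a formal test for overdispersion or 0 inflation.

```python
import math

def poisson_pmf(k, lam):
    """Probability of observing exactly k events when the mean is lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)

# 1) For a Poisson distribution, the mean and the variance both equal lambda.
lam = 4.0
support = range(60)  # probability mass beyond k = 59 is negligible for lam = 4
pmf_mean = sum(k * poisson_pmf(k, lam) for k in support)
pmf_var = sum((k - pmf_mean) ** 2 * poisson_pmf(k, lam) for k in support)
print(f"lambda = {lam}: mean = {pmf_mean:.4f}, variance = {pmf_var:.4f}")

# 2) Hypothetical counts (eg, ventilation days) with many 0s; NOT study data.
counts = [0] * 12 + [1, 1, 2, 3, 5, 8, 9, 11]
n = len(counts)
mean = sum(counts) / n
variance = sum((c - mean) ** 2 for c in counts) / (n - 1)  # sample variance

# A variance-to-mean ratio well above 1 suggests overdispersion.
dispersion_ratio = variance / mean

# Number of 0s a Poisson distribution with the same mean would predict.
observed_zeros = counts.count(0)
expected_zeros = n * poisson_pmf(0, mean)

print(f"mean = {mean:.2f}, variance = {variance:.2f}, ratio = {dispersion_ratio:.2f}")
print(f"observed 0s = {observed_zeros}, expected under Poisson = {expected_zeros:.1f}")
```

In this made-up sample, the variance is several times the mean and far more 0s are observed than a Poisson distribution with the same mean would predict. That is the pattern for which negative binomial or 0-inflated models are intended.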
Another assumption of Poisson regression is that the observations are independent of one another; however, violation of this assumption can be addressed by specifying the model within the framework of generalized linear mixed effects models.⁶ Thus, even when key assumptions of Poisson regression regarding the distribution and independence of data are not satisfied, count data can often still be appropriately analyzed by one of the numerous modifications and extensions of Poisson regression, as was done by Nafiu et al.¹
