Hirshon and Wilson raise important questions about the statistical methodology underlying part of our approach to estimating the burden of infectious disease among inmates and releasees. Our approach involved estimating the number of unique individuals who entered jails during a given year. Given the rapid turnover of jail populations, we assumed that the number of entrants would be roughly equivalent to the number of releasees. Although the total number of entries is countable, the number of unique individuals who entered had to be inferred from arrest data. For that purpose, we specified and estimated a stochastic model of the arrest process (a negative binomial regression) and used it to infer that each person who entered jail at least once during a year in fact entered, on average, about 1.38 times that year.
Hirshon and Wilson question the use of a negative binomial regression for generating this estimate. Researchers frequently use the Poisson and negative binomial distributions to model arrest histories.1 This is partly a matter of convenience, because those distributions are especially tractable in the study of repeated events, and partly because those distributions seem to fit arrest data particularly well. Furthermore, both the Poisson and negative binomial models give consistent estimates, provided the mean function is correctly specified, even when the maintained distribution is false.2 Hirshon and Wilson should rest assured that our estimates are reasonably robust with respect to distributional assumptions about the arrest process. We were more concerned about data quality than about statistical modeling. A reader should take the estimated arrest (and, by inference, release) rate as a rough approximation of the true rate.
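To illustrate how a count model can yield an average number of entries per person who entered at least once, the following is a minimal sketch, not the procedure used in our article. It assumes that yearly entries per person follow a negative binomial (or, in the limit, a Poisson) distribution and computes the mean conditional on at least one entry. The function names and the parameter values (`mean`, `dispersion`) are hypothetical and chosen only for illustration.

```python
import math


def nb_zero_prob(mean, dispersion):
    """P(N = 0) for a negative binomial with the given mean and
    dispersion k, parameterized so that variance = mean + mean**2 / k."""
    return (dispersion / (dispersion + mean)) ** dispersion


def nb_mean_given_any(mean, dispersion):
    """E[N | N >= 1]: average entries among people with at least one entry."""
    return mean / (1.0 - nb_zero_prob(mean, dispersion))


def poisson_mean_given_any(mean):
    """Same conditional mean under a Poisson model, where P(N = 0) = exp(-mean)."""
    return mean / (1.0 - math.exp(-mean))


# Hypothetical inputs, not estimates from the article.
illustrative_mean = 0.5
illustrative_dispersion = 1.0
print(nb_mean_given_any(illustrative_mean, illustrative_dispersion))
print(poisson_mean_given_any(illustrative_mean))
```

Under either distribution the conditional mean exceeds 1 whenever some people enter more than once, which is why the count of entries overstates the count of unique individuals.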
A recent report, not available at the time we wrote our article, explains the intricacies of using a negative binomial model to estimate arrest rates using samples collected at booking facilities (Rhodes W, Kling R, “Estimating prevalence of hard to reach populations,” unpublished paper, 2002). Using probability-based survey data from 35 counties, that report provides statistics leading to a consensus adjustment of 1.42 arrests per year for chronic drug users. Had we used that consensus adjustment in place of the 1.38 used in our article, our estimates of the burden of infectious disease would have been only about 3% lower. That reduced figure is probably a lower limit, because chronic drug users have comparatively high arrest rates. On the other hand, had we assumed that every arrestee (and, by inference, every releasee) was a unique individual, our estimates of the burden of infectious disease among inmates would have been about 38% higher. That figure is surely an unrealistic upper limit, because many people cycle repeatedly through jails. In sum, any errors resulting from use of the negative binomial distribution to model the arrest process are unlikely to have a material effect on the conclusions reached in our article.
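The two bounds above follow from simple arithmetic, sketched here under the assumption that the burden estimate scales with the number of unique individuals, that is, with total entries divided by mean entries per person. The function name `relative_change` is hypothetical; the rates 1.38, 1.42, and 1.00 are the ones discussed in the text.

```python
def relative_change(baseline_rate, alternative_rate):
    """Fractional change in an estimate proportional to entries / rate
    when the mean entries per person moves from baseline to alternative."""
    return baseline_rate / alternative_rate - 1.0


# Consensus adjustment for chronic drug users in place of our 1.38.
print(f"{relative_change(1.38, 1.42):+.1%}")  # -2.8%

# Every arrestee treated as a unique individual (rate of 1.00).
print(f"{relative_change(1.38, 1.00):+.1%}")  # +38.0%
```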