Accounting for Misclassification Bias in Binary Outcome Measures of Illness: The Case of Post-Traumatic Stress Disorder in Male Veterans

Elizabeth Savoca

doi:10.1111/j.1467-9531.2011.01239.x

Abstract

The theoretical consequences of measurement error in continuous outcome variables are widely known among practitioners, at least for the classical model: purely random errors lead to a loss of efficiency but not to bias in the regression coefficients. When the outcome variable is binary, however, the coefficients of both linear and nonlinear regression models are biased, even if the measurement error (in this setting more commonly called classification error) is purely random. This paper illustrates a method of correcting for misclassification bias that relies solely on the primary survey data. The method is particularly suited to analyses of surveys in which external validation of responses is unavailable but there is strong reason to suspect contaminated data, a situation common in observational studies of population health. The technique is applied to a model of the antecedents of post-traumatic stress disorder (PTSD) using data from a large-scale cross-sectional survey of Vietnam-era veterans. The results show that, once adjusted for errors in diagnoses, the sample PTSD prevalence estimate falls significantly; that failure to correct for misclassification in PTSD dramatically understates the effects of risk factors; and that this downward bias remains even when the model incorporates differential classification errors, that is, errors that are correlated with some of the explanatory variables in the model.
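The claim that purely random classification error biases coefficients on a binary outcome can be illustrated with a small simulation. The sketch below is not the paper's estimator and uses no data from the survey. It assumes a hypothetical binary risk factor `x`, a linear probability model with true slope `beta`, and constant false-positive and false-negative rates `a0` and `a1` (all names and values are illustrative assumptions). Under those assumptions the observed outcome satisfies E[y_obs | x] = a0 + (1 - a0 - a1) P(y = 1 | x), so the slope shrinks by the factor (1 - a0 - a1), and an observed prevalence can be mapped back to the true one by inverting the same relation.

```python
import numpy as np


def simulate_attenuation(n=200_000, beta=0.3, a0=0.05, a1=0.20, seed=0):
    """Compare the slope on a true binary outcome with the slope on a
    randomly misclassified copy of it.

    a0: assumed false-positive rate, P(y_obs = 1 | y_true = 0)
    a1: assumed false-negative rate, P(y_obs = 0 | y_true = 1)
    All parameter values are illustrative, not taken from the paper.
    """
    rng = np.random.default_rng(seed)

    x = rng.integers(0, 2, size=n)        # hypothetical binary risk factor
    p_true = 0.10 + beta * x              # true P(y = 1 | x)
    y_true = rng.random(n) < p_true       # error-free outcome

    # Misclassification is independent of x (non-differential errors).
    false_pos = rng.random(n) < a0
    false_neg = rng.random(n) < a1
    y_obs = np.where(y_true, ~false_neg, false_pos)

    # With a single binary regressor, the OLS slope of a linear
    # probability model equals the difference in group means.
    slope_true = y_true[x == 1].mean() - y_true[x == 0].mean()
    slope_obs = y_obs[x == 1].mean() - y_obs[x == 0].mean()
    slope_predicted = (1 - a0 - a1) * beta
    return slope_true, slope_obs, slope_predicted


def corrected_prevalence(p_obs, a0, a1):
    """Invert p_obs = a0 + (1 - a0 - a1) * p for the true prevalence p,
    assuming the error rates a0 and a1 are known."""
    return (p_obs - a0) / (1 - a0 - a1)


if __name__ == "__main__":
    s_true, s_obs, s_pred = simulate_attenuation()
    print(f"slope on true outcome:     {s_true:.3f}")
    print(f"slope on observed outcome: {s_obs:.3f}")
    print(f"predicted attenuated slope: {s_pred:.3f}")
```

In this sketch the error rates are treated as known. The setting described in the abstract is harder, because the rates must be inferred from the primary survey data alone and may vary with the explanatory variables.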
