Abstract

The effects of missing values for a confounding variable are investigated in the setting of case-control studies in which, for simplicity, the effect of one binary risk factor and one categorical confounding variable on disease risk is under investigation. Some ad hoc techniques for dealing with missing values are examined under different assumptions about the missing-data mechanism. Examples are given to illustrate that the magnitude of the bias introduced by applying an inadequate procedure can be large under circumstances that occur frequently in empirical research. This is true even for so-called complete case analysis, i.e., when only data on subjects with complete information are used. Appropriate bias corrections are derived. Making use of data on those subjects who are neglected in complete case analysis by creating an additional category always results in biased estimation. An alternative is to allocate these subjects to the cells of the contingency table in an appropriate manner. This approach yields consistent estimates if the data are missing at random. Choosing an appropriate method for dealing with missing values always requires some knowledge of why the data are missing. This suggests that investigators should carry out validation studies to understand whether the missing values occur randomly across the study population or occur more frequently in specific subgroups.
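
The three strategies named in the abstract can be compared on a small numerical sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the expected cell counts in `full`, the observation probabilities in `p_obs`, the restriction to a binary confounder, and the use of the Mantel-Haenszel estimator to summarize the stratified tables. The counts are chosen so that the true odds ratio is 2 in both confounder strata, and the confounder is missing at random in the sense that its probability of being observed depends only on disease status and exposure.

```python
# Hypothetical expected counts full[(d, e, c)]:
# d = disease (1 case, 0 control), e = exposure (1/0), c = confounder level.
# The stratum-specific odds ratio is 2 for c = 0 and for c = 1.
full = {
    (1, 1, 0): 40.0, (1, 0, 0): 60.0, (0, 1, 0): 25.0, (0, 0, 0): 75.0,
    (1, 1, 1): 120.0, (1, 0, 1): 30.0, (0, 1, 1): 60.0, (0, 0, 1): 30.0,
}

# Assumed probability that c is observed, depending on (d, e) only (MAR).
p_obs = {(1, 1): 0.9, (1, 0): 0.6, (0, 1): 0.6, (0, 0): 0.9}

GROUPS = [(1, 1), (1, 0), (0, 1), (0, 0)]
LEVELS = [0, 1]


def mantel_haenszel(table):
    """Mantel-Haenszel odds ratio over all confounder strata in `table`."""
    strata = sorted({c for (_, _, c) in table}, key=str)
    num = den = 0.0
    for c in strata:
        case_exp, case_unexp = table[(1, 1, c)], table[(1, 0, c)]
        ctrl_exp, ctrl_unexp = table[(0, 1, c)], table[(0, 0, c)]
        n = case_exp + case_unexp + ctrl_exp + ctrl_unexp
        num += case_exp * ctrl_unexp / n
        den += case_unexp * ctrl_exp / n
    return num / den


# Expected counts of subjects whose confounder value is observed.
observed = {(d, e, c): full[(d, e, c)] * p_obs[(d, e)]
            for (d, e) in GROUPS for c in LEVELS}

# Expected counts of subjects whose confounder value is missing.
missing = {(d, e): sum(full[(d, e, c)] for c in LEVELS) * (1 - p_obs[(d, e)])
           for (d, e) in GROUPS}

# Strategy 1: complete case analysis (drop subjects with missing c).
or_complete_case = mantel_haenszel(observed)

# Strategy 2: treat "missing" as an additional confounder category.
with_category = dict(observed)
for (d, e) in GROUPS:
    with_category[(d, e, "missing")] = missing[(d, e)]
or_extra_category = mantel_haenszel(with_category)

# Strategy 3: allocate subjects with missing c to the cells in proportion
# to the observed distribution of c within each (d, e) group.
allocated = {}
for (d, e) in GROUPS:
    group_total = sum(observed[(d, e, c)] for c in LEVELS)
    for c in LEVELS:
        share = observed[(d, e, c)] / group_total
        allocated[(d, e, c)] = observed[(d, e, c)] + missing[(d, e)] * share
or_allocated = mantel_haenszel(allocated)

or_full = mantel_haenszel(full)

print("full data:        ", round(or_full, 3))
print("complete case:    ", round(or_complete_case, 3))
print("extra category:   ", round(or_extra_category, 3))
print("allocation:       ", round(or_allocated, 3))
```

With these assumed numbers, the complete-case estimate is inflated by the factor p_obs(1,1) · p_obs(0,0) / (p_obs(1,0) · p_obs(0,1)) = 2.25, the extra-category estimate is pulled away from 2 by the confounded "missing" stratum, and the proportional allocation recovers the full-data table. The allocation works here only because the missingness depends on observed variables alone; if it also depended on the unobserved confounder value, the observed shares would no longer reflect the true distribution.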
