Abstract

Cancer registries collect information on type of cancer, histological characteristics, stage at diagnosis, patient demographics, initial course of treatment including surgery, radiotherapy, and chemotherapy, and patient survival (Hewitt and Simone 1999). Such information can be valuable for studying the patterns of cancer epidemiology, diagnosis, treatment, and outcome. However, misreporting of registry information is unavoidable, and thus studies based solely on registry data can yield invalid results. Past literature has documented the inaccuracy of registry records on adjuvant, or supplemental, chemotherapy and radiotherapy. The Quality of Cancer Care (QOCC) project (Ayanian et al. 2003) used data from the California Cancer Registry, the largest geographically contiguous population-based cancer registry in the world, to study the patterns of receiving and reporting adjuvant therapies for stage II/III colorectal cancer patients. The study surveyed the treating physicians for a subsample of the patients in the registry to obtain more accurate reports of whether those patients had received adjuvant therapies. It confirmed that the registry data are inaccurate, with the errors running in the direction of underreporting. Table 1 (line 2 vs line 1), which is based on this study, implies substantial underreporting of 20% and 13% in chemotherapy and radiotherapy rates, respectively.

Table 1. Adjuvant Therapy Rates % (SE)

Given that the registry is a valuable data source in health services research, how can we improve the quality of inferences drawn from the comprehensive but inaccurate registry database? Suppose, for example, that our goal is to obtain accurate estimates of treatment rates from the misreported records in the registry. A simple approach is to use only the validation sample, i.e., the physician survey data collected in the QOCC project.
However, due to logistical reasons, the survey sample ( 12000 patients) used in the study, and hence analyzing the validation sample alone would greatly reduce precision, especially for complex estimands such as regression estimates. Another approach, the errors-in-variables method (Carroll et al. 2006), would analyze the registry data while adjusting for the reporting error. This approach typically involves modeling the relationship between the correct values and the misreported ones, represented here by the validation sample and the corresponding registry data. Because it uses information from both sources, it should yield valid results with increased precision. However, the statistical sophistication of the error-adjustment procedures might be challenging for analysts who lack the statistical expertise to implement such methods. A more appealing strategy might be multiple imputation (Rubin 1987). In a typical nonresponse problem, this method first “fills in” (imputes) the missing variables several times to create multiple completed datasets. Each completed dataset is then analyzed using complete-data procedures, and the results from the separate datasets are combined into a single inference using simple rules. In the misreporting problem, the essence of this strategy is to impute the uncollected correct treatment variables in the remainder of the registry and then to perform the analysis on the completed (corrected) data. Figure 1 illustrates this strategy. The corrected registry data can then be used by practitioners without any additional modeling effort. As with the errors-in-variables approach, the imputation model characterizes the measurement error process and makes the adjustment.
The imputer may also incorporate additional information that is not generally available to analysts, such as information from other administrative databases, into the imputation model to further improve the analyses (Yucel and Zaslavsky 2005; Zheng et al. 2006).

Figure 1. An illustration of using imputation to correct for underreporting. X is a matrix of covariate values with one row for each person in the registry. Y(R) is the matrix of reported treatment status for various treatments. Y(O) is the true treatment. ...
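
The impute, analyze, and combine workflow described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the imputation model used in the study. It assumes a single binary treatment, ignores the covariates X, and uses a simple Beta-Binomial model for the probability of true treatment given reported status. The function names (`rubin_combine`, `impute_once`, `simulate`) and all numeric settings (sample sizes, true rate, reporting sensitivity) are hypothetical choices made for the example. The combining step follows the standard rules of Rubin (1987) for the pooled estimate and its total variance; the reference t distribution and its degrees of freedom are omitted.

```python
import random
from statistics import mean, variance


def rubin_combine(estimates, variances):
    """Pool m completed-data estimates and variances (Rubin 1987)."""
    m = len(estimates)
    q_bar = mean(estimates)      # pooled point estimate
    u_bar = mean(variances)      # average within-imputation variance
    b = variance(estimates)      # between-imputation variance (m - 1 denominator)
    total_var = u_bar + (1 + 1 / m) * b
    return q_bar, total_var


def impute_once(reported, true_validated, rng):
    """Return one completed 0/1 treatment vector.

    reported: registry-reported status for every patient.
    true_validated: {index: true status} for the validation subsample.
    """
    # Cross-tabulate true status by reported status in the validation sample.
    counts = {0: [0, 0], 1: [0, 0]}  # counts[r] = [n_true_0, n_true_1]
    for i, y in true_validated.items():
        counts[reported[i]][y] += 1
    # Draw P(true = 1 | reported = r) from a Beta posterior (uniform prior)
    # so that parameter uncertainty carries into the imputations.
    p = {r: rng.betavariate(1 + c[1], 1 + c[0]) for r, c in counts.items()}
    completed = []
    for i, r in enumerate(reported):
        if i in true_validated:
            completed.append(true_validated[i])  # keep verified values
        else:
            completed.append(1 if rng.random() < p[r] else 0)
    return completed


def simulate(seed=0, n=5000, n_valid=1000, true_rate=0.7,
             sensitivity=0.75, m=20):
    """Toy registry with underreporting only (all settings hypothetical)."""
    rng = random.Random(seed)
    truth = [1 if rng.random() < true_rate else 0 for _ in range(n)]
    # A treated patient is recorded as treated with probability `sensitivity`;
    # untreated patients are never recorded as treated.
    reported = [1 if (y == 1 and rng.random() < sensitivity) else 0
                for y in truth]
    valid_idx = rng.sample(range(n), n_valid)
    true_validated = {i: truth[i] for i in valid_idx}

    estimates, variances = [], []
    for _ in range(m):
        completed = impute_once(reported, true_validated, rng)
        p_hat = mean(completed)
        estimates.append(p_hat)
        variances.append(p_hat * (1 - p_hat) / n)  # binomial variance
    q_bar, total_var = rubin_combine(estimates, variances)
    return {"naive": mean(reported), "mi": q_bar,
            "mi_se": total_var ** 0.5, "truth": mean(truth)}
```

Calling `simulate()` returns the naive registry rate, the multiply imputed rate with its standard error, and the simulated true rate, so the three can be compared directly. In a realistic application the imputation model would condition on covariates and possibly on other administrative data, as the text notes.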
