Abstract

Linked cancer registry and medical claims data have increased the capacity for cancer research. However, few studies have described methods for selecting information when the linked data sources overlap or disagree, and that choice may affect how the data are used. We developed a systematic process to evaluate and consolidate cancer diagnosis and treatment information in the Military Cancer Epidemiology Data System (MilCanEpi), which links the Department of Defense Central Cancer Registry (CCR) with the Military Health System Data Repository (MDR) administrative claims database. MilCanEpi contains information on cancer diagnosis and treatment of patients receiving care from 1998 to 2014. We used an iterative process guided by knowledge of data features, current literature, and logical comparisons between the CCR and MDR data to evaluate and consolidate whether each patient had a cancer diagnosis and received treatment (yes or no), along with the corresponding dates. We applied the process to breast cancer data as an example. Agreement between the two data sources on diagnosis and treatment dates was evaluated using Cohen's κ with 95% CIs. In MilCanEpi, we identified 15,965 patients with a breast cancer diagnosis and 15,145 patients who underwent breast cancer surgery; 97.9% and 84.1% of these patients had records in both CCR and MDR for diagnosis and surgery, respectively. Exact agreement between the two data sources was 13.7% for diagnosis dates (Cohen's κ = 0.14; 95% CI, 0.13 to 0.14) and 68.9% for surgery dates (Cohen's κ = 0.69; 95% CI, 0.68 to 0.70). After the systematic process was applied, 98.1% of patients with a breast cancer diagnosis and 99.7% of patients with surgery had information selected for analytic data sets. The process resulted in high consolidation rates for breast cancer data in MilCanEpi and may serve as a data selection template for other tumor sites and linked data sources.
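The agreement statistic named above can be illustrated with a minimal sketch. It assumes that each patient's date from each source is treated as a nominal category, so that only exact matches count as agreement, and it uses the simple large-sample standard error for κ. The abstract does not state how the authors computed κ or its interval, so the function name `cohen_kappa`, the toy dates, and the choice of standard error are illustrative assumptions, not the study's method.

```python
from collections import Counter
from math import sqrt


def cohen_kappa(source_a, source_b, z=1.96):
    """Cohen's kappa for two paired lists of labels, with an approximate CI.

    Assumption: labels (here, date strings) are nominal categories, and the
    interval uses the simple large-sample standard error
    sqrt(po * (1 - po) / (n * (1 - pe)**2)).
    """
    if len(source_a) != len(source_b) or not source_a:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(source_a)

    # Observed agreement: share of patients with identical labels.
    po = sum(a == b for a, b in zip(source_a, source_b)) / n

    # Chance agreement: product of the marginal proportions, summed over
    # labels that occur in both sources.
    count_a, count_b = Counter(source_a), Counter(source_b)
    shared = count_a.keys() & count_b.keys()
    pe = sum(count_a[k] * count_b[k] for k in shared) / n ** 2
    if pe == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")

    kappa = (po - pe) / (1 - pe)
    se = sqrt(po * (1 - po) / (n * (1 - pe) ** 2))
    return kappa, (kappa - z * se, kappa + z * se)


# Hypothetical toy data: registry and claims dates for four patients.
registry_dates = ["2010-01-05", "2010-01-05", "2011-03-02", "2011-03-02"]
claims_dates = ["2010-01-05", "2011-03-02", "2011-03-02", "2011-03-02"]
kappa, (ci_low, ci_high) = cohen_kappa(registry_dates, claims_dates)
```

With many distinct dates, chance agreement is close to zero, so κ computed this way lands near the raw proportion of exact matches, which is consistent with the paired percentages and κ values reported above.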
