Determining the Time of Cancer Recurrence Using Claims or Electronic Medical Record Data.

Hajime Uno, Michael J Hassett, Angel M Cronin, Nikki M Carroll, Mark C Hornbrook, Debra P Ritzwoller

doi:10.1200/cci.17.00163

Abstract

Data from claims and electronic medical records (EMRs) are frequently used to identify clinical events (eg, cancer diagnosis, stroke). However, accurately determining the time of clinical events can be challenging, and the methods used to generate time estimates are underdeveloped. We sought to develop an approach to determine the time of a clinical event (cancer recurrence) using high-dimensional longitudinal structured data. Manual chart abstraction provided information regarding the actual time of cancer recurrence. These data were linked to claims from Medicare or structured EMR data from the Cancer Research Network, which were used to determine time of recurrence for patients with lung or colorectal cancer. We analyzed the longitudinal profile of codes that could help determine the time of recurrence, adjusted for systematic differences between code dates and recurrence dates, and integrated time estimates from different codes to empirically derive an optimal algorithm. We identified twelve code groups that could help determine the time of recurrence. Using claims data for patients with lung cancer, the optimal algorithm consisted of three code groups and provided an average prediction error of 4.8 months. Using EMR data or applying this approach to patients with colorectal cancer yielded similar results. Time estimates were improved by selecting codes that were not necessarily the same as those used to identify recurrence, combining time estimates from multiple code groups, and adjusting for systematic bias between code dates and recurrence dates. Improving the accuracy of time estimates for clinical events can facilitate research, quality measurement, and process improvement.
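The two adjustments named in the abstract, correcting each code group for its systematic offset from the true recurrence date and then combining estimates across code groups, can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the code group names (`chemo`, `imaging`), the toy values, the use of a simple mean offset as the bias estimate, and the unweighted average as the combining rule. It is not the authors' algorithm, and the abstract does not specify how the code-group time estimates are derived or weighted.

```python
from statistics import mean

# Illustrative sketch only. Times are in months since diagnosis.
# Each training record pairs per-code-group time estimates with the
# chart-abstracted recurrence time. All names and values are hypothetical.

def estimate_bias(train):
    """Mean offset (code time minus true recurrence time) per code group."""
    diffs = {}
    for code_times, true_time in train:
        for group, t in code_times.items():
            diffs.setdefault(group, []).append(t - true_time)
    return {group: mean(d) for group, d in diffs.items()}

def predict_time(code_times, bias, groups):
    """Average the bias-corrected estimates from the selected code groups.

    Returns None when the patient has no codes in any selected group.
    """
    adjusted = [
        code_times[g] - bias[g]
        for g in groups
        if g in code_times and g in bias
    ]
    return mean(adjusted) if adjusted else None

def mean_abs_error(data, bias, groups):
    """Mean absolute prediction error over patients with a prediction."""
    errors = []
    for code_times, true_time in data:
        pred = predict_time(code_times, bias, groups)
        if pred is not None:
            errors.append(abs(pred - true_time))
    return mean(errors)

# Toy training data (hypothetical).
train = [
    ({"chemo": 14.0, "imaging": 11.0}, 12.0),
    ({"chemo": 26.0, "imaging": 25.0}, 24.0),
    ({"chemo": 8.0}, 6.0),
]

bias = estimate_bias(train)
groups = ["chemo", "imaging"]
new_patient = {"chemo": 20.0, "imaging": 17.0}
predicted = predict_time(new_patient, bias, groups)
```

In this toy setup, comparing `mean_abs_error` across different subsets of `groups` on held-out data would be one simple way to choose which code groups to keep, in the spirit of the empirical selection the abstract describes.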

Full Text