Oncology databases that integrate genomic and clinical data have become valuable resources for precision medicine. However, the generalizability of these databases has not been comprehensively assessed. This study aimed to describe the demographics, clinical characteristics, treatments, and overall survival of breast cancer cohorts in GENIE-BPC and three other databases: SEER, SEER-Medicare, and the Merative MarketScan Research Databases. Women with invasive breast cancer were identified through electronic health records (EHR), cancer registries, or ICD-9/10-CM codes. Eligible patients were aged 18 years or older, or met the age requirement of the individual database. Treatments were identified from EHR or from HCPCS/NDC codes in claims. Overall survival was estimated as the time from diagnosis to death. Among female breast cancer patients in the GENIE-BPC (n = 775), SEER (n = 548 336), SEER-Medicare (n = 68 914), and MarketScan (n = 109 499) databases, the median ages at initial diagnosis were 44, 62, 74, and 57 years, respectively. Compared with SEER/SEER-Medicare, a greater proportion of patients in GENIE-BPC had high nuclear grade (grade III-IV: 57% vs. 26%/24%), had stage IV disease (25.3% vs. 5%/3.6%), had triple-negative breast cancer (19.7% vs. 10.2%/8.5%), and received chemotherapy (85.0% vs. NA/22.3%). The 1-, 3-, and 5-year overall survival rates were lower in GENIE-BPC (78.5%, 60.5%, 55.5%) than in SEER (95.8%, 89.5%, 85.5%) and SEER-Medicare (91.6%, 81.4%, 75.0%). In summary, breast cancer patients in GENIE-BPC were younger, had more advanced disease, were more likely to have triple-negative breast cancer and to receive chemotherapy, and had poorer overall survival. Researchers must use statistical adjustment when extrapolating results (e.g., biomarker prevalence) from GENIE-BPC to the larger breast cancer population.
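As an illustration of what such a statistical adjustment might look like, the sketch below applies direct standardization by disease stage, one of several possible approaches and not necessarily the one the authors have in mind. The stage IV shares (25.3% for GENIE-BPC, 5% for SEER) come from the abstract. The stratum-specific biomarker prevalences, the two-stratum split, and the function name `standardized_prevalence` are hypothetical and chosen only for demonstration.

```python
def standardized_prevalence(stratum_prevalence, target_shares):
    """Direct standardization: weight each stratum-specific prevalence
    by that stratum's share of the target population."""
    if set(stratum_prevalence) != set(target_shares):
        raise ValueError("strata must match between the two inputs")
    total = sum(target_shares.values())
    return sum(
        stratum_prevalence[s] * target_shares[s] / total for s in target_shares
    )


# Stage IV shares reported in the abstract; the remainder is grouped as stage I-III.
genie_shares = {"stage_IV": 0.253, "stage_I_III": 0.747}
seer_shares = {"stage_IV": 0.05, "stage_I_III": 0.95}

# HYPOTHETICAL biomarker prevalence within each stage stratum (illustrative only,
# not taken from the study).
biomarker_prev = {"stage_IV": 0.40, "stage_I_III": 0.20}

# Crude estimate reflects the GENIE-BPC stage mix; the adjusted estimate
# reweights the same stratum prevalences to the SEER stage mix.
crude = standardized_prevalence(biomarker_prev, genie_shares)
adjusted = standardized_prevalence(biomarker_prev, seer_shares)

print(f"Crude prevalence (GENIE-BPC stage mix): {crude:.3f}")
print(f"Standardized prevalence (SEER stage mix): {adjusted:.3f}")
```

With these made-up inputs, the crude estimate exceeds the standardized one because stage IV patients are over-represented in GENIE-BPC. A real analysis would likely need to adjust for several factors at once (for example age, stage, and subtype), which this single-variable sketch does not attempt.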