Introduction: Venous thromboembolism (VTE) significantly affects cancer patients undergoing systemic therapy. Existing risk models developed in pan-cancer populations, such as the Khorana score, offer limited discrimination in cancer-specific settings. We aimed to develop and validate a risk assessment model (RAM) tailored to lymphoma patients, accounting for the clinical and sociodemographic heterogeneity across three distinct healthcare systems in the United States.
Methods: Using electronic health records (EHR) linked to cancer registry data (2006-2021) from the Veterans Affairs national healthcare system (VA), we randomly divided 10,313 lymphoma patients into an 80% derivation cohort and a 20% internal validation cohort. External validation was carried out using data from two other healthcare systems: Harris Health System (HHS, N=854, 2011-2020) and MD Anderson Cancer Center (MDACC, N=1,858, 2017-2020). Patients were included if they had newly diagnosed lymphoma requiring first-line systemic therapy within 1 year of diagnosis. Patients were excluded if they had a diagnosis of acute VTE within the preceding 6 months or were prescribed an anticoagulant within 1 month before the index date. The index date was the start of systemic therapy, and all covariates were extracted on or before the index date. Incident VTE was defined using our published computable phenotype algorithm (PMID 36626707; 37067102). For RAM derivation (Figure 1), 32 candidate predictors were initially chosen based on clinical relevance and EHR data availability. Missing values were imputed via chained random forests. Parsimonious variable categories were selected by a combination of Least Absolute Shrinkage and Selection Operator (LASSO), Recursive Feature Elimination (RFE), and random forest to generate the final logistic regression model. For external validation, complete case analysis was performed using the derived beta coefficients without model refitting.
Bootstrapped time-dependent C-statistics and calibration curves for 6-month VTE were used to assess discrimination and fit. A pre-determined threshold of 8% predicted probability of VTE at 6 months was used to stratify patients into high- vs. low-risk groups. Competing risk models were used to assess cumulative incidence.
Results: At 6 months, the VTE incidence was 5.75% (n=469) in the VA derivation cohort and 6.12% (n=124), 8.24% (n=69), and 6.55% (n=112) in the VA, HHS, and MDACC validation cohorts, respectively. The three healthcare systems differed in baseline demographics, lymphoma types, systemic therapies, and comorbidities. Patients from VA were mostly older males with more comorbidities and a remote history of VTE; patients from HHS were mostly uninsured minorities who were younger but had more aggressive histology; and those from MDACC were patients treated at a comprehensive cancer center. The variables in the final RAM included lymphoma type (4 groups ranging from very-low to very-high risk), body mass index, indwelling catheter, cytotoxic vs. targeted therapy, history of VTE, recent hospitalization, and paralysis/immobilization. Table 1 shows the variable prevalence in each cohort. As shown in Figure 1, the new RAM showed promising discrimination for VTE, with a C-statistic of 0.69 in the derivation cohort and 0.67, 0.70, and 0.72 in the internal and two external validation cohorts. Splitting patients at the 8% predicted-risk threshold clearly separated the high- and low-risk groups despite significant differences in baseline characteristics. This represented a noticeable improvement for lymphoma patients over the Khorana score, which yielded C-statistics of 0.53-0.56. The overall calibration of the RAM appeared adequate.
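As a rough illustration of the two summaries reported above, the sketch below computes a plain C-statistic for a binary 6-month outcome and applies the 8% cut-off. It is a simplification: the study used bootstrapped, time-dependent C-statistics, which this does not reproduce. The function names, the toy data, and the choice to count a predicted risk of exactly 8% as high risk are assumptions made for illustration only.

```python
from itertools import product


def c_statistic(probs, events):
    """Share of (event, non-event) pairs in which the event patient
    has the higher predicted risk; tied predictions count as 0.5."""
    cases = [p for p, e in zip(probs, events) if e == 1]
    controls = [p for p, e in zip(probs, events) if e == 0]
    if not cases or not controls:
        raise ValueError("need at least one event and one non-event")
    concordant = 0.0
    for p_case, p_control in product(cases, controls):
        if p_case > p_control:
            concordant += 1.0
        elif p_case == p_control:
            concordant += 0.5
    return concordant / (len(cases) * len(controls))


def stratify(probs, threshold=0.08):
    """Label each predicted probability as high or low risk.
    Assumption: a risk equal to the threshold is treated as high."""
    return ["high" if p >= threshold else "low" for p in probs]


# Toy data (hypothetical, not from the study cohorts).
toy_probs = [0.02, 0.05, 0.10, 0.20]
toy_events = [0, 0, 1, 0]

print(c_statistic(toy_probs, toy_events))
print(stratify(toy_probs))
```

In the toy data, the single event patient outranks two of the three non-event patients, so the C-statistic is two thirds; a value of 0.5 would mean the model ranks patients no better than chance.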
Conclusions: Using multiple cancer-EHR cohorts with heterogeneous patients and a wide range of lymphoma subtypes, we successfully derived and externally validated an improved RAM for VTE specifically in lymphoma patients requiring early initiation of systemic therapy. Along with several well-known patient-specific risk factors, lymphoma type was an important predictor of VTE risk. The new RAM could improve VTE risk stratification and prevention efforts in lymphoma patients.
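The external validation step described in Methods, applying the derived beta coefficients without refitting, can be sketched as follows. This is a minimal illustration of scoring a patient with a fixed logistic model, not the published RAM: the intercept, the coefficient values, and the variable names are placeholders invented for the example, and the abstract does not report the actual coefficients.

```python
import math


def predicted_risk(intercept, betas, features):
    """Logistic model with fixed coefficients:
    p = 1 / (1 + exp(-(intercept + sum(beta_k * x_k)))).
    Raises if any predictor is missing, mirroring a complete-case analysis."""
    missing = [name for name in betas if features.get(name) is None]
    if missing:
        raise ValueError(f"incomplete case, missing: {sorted(missing)}")
    linear_predictor = intercept + sum(
        betas[name] * features[name] for name in betas
    )
    return 1.0 / (1.0 + math.exp(-linear_predictor))


# PLACEHOLDER coefficients for illustration only (NOT the study's values).
example_intercept = -3.0
example_betas = {"prior_vte": 1.0, "indwelling_catheter": 0.5}

patient_a = {"prior_vte": 0, "indwelling_catheter": 0}
patient_b = {"prior_vte": 1, "indwelling_catheter": 1}

for label, patient in [("A", patient_a), ("B", patient_b)]:
    risk = predicted_risk(example_intercept, example_betas, patient)
    group = "high" if risk >= 0.08 else "low"  # 8% threshold from the abstract
    print(label, round(risk, 3), group)
```

Because the coefficients are frozen, the same function can be applied unchanged to each validation cohort; only the patient-level inputs differ.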