Introduction
Transfusion data completeness is critical for assessing real-world response and progression in myeloid malignancies. In the real-world setting, transfusion data can be collected via insurance claims or by Electronic Health Record (EHR) abstraction. Transfusion events captured from claims have previously been shown to be 71% sensitive and 92% specific compared with EHR abstraction in the community hospital setting; however, this has not been evaluated in the outpatient community oncology setting (Howard DH, et al. Transfusion Medicine. 2016;26:457-459). Our objective was to assess the completeness of transfusion data from claims in The Komodo Healthcare Map™ (KHM) relative to EHR abstraction in the Flatiron Health Research Database (FHRD) in a group of patients with myelofibrosis (MF).

Methods
This study used the nationwide Flatiron Health EHR-derived de-identified database and The Komodo Healthcare Map™, a database of de-identified claims-based healthcare encounters from insured patients in the US (2015-2022). This cross-sectional study included a cohort of patients with MF from FHRD who underwent EHR abstraction (EHRC; n = 507), which captured transfusion dates and number of units from the date of diagnosis up to the data cutoff on 30 November 2022 (observation window). These patients were linked to KHM to create a Linked Claims Cohort (LCC); the sub-cohort of patients with any closed claims during the observation window formed the Closed Claims Cohort (CCC). In the claims data, transfusions were assessed by reviewing transfusion-related HCPCS and ICD codes. Analyses were performed on the matched patients in the LCC and CCC, respectively. The intraclass correlation coefficient (ICC) was used to assess concordance between the claims and EHR data sources on the number of transfusions per patient. Percent overlap, sensitivity, specificity, and positive and negative predictive values (PPV and NPV) were also calculated.
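As an illustration of the agreement statistics named in the Methods, the following minimal Python sketch computes percent agreement, Cohen's kappa, sensitivity, specificity, PPV, and NPV from a 2x2 table, treating the EHR as the reference standard. The function name `agreement_metrics` and the cell counts in the example are assumptions: the abstract does not report cell counts, so the values below were back-calculated from the percentages reported for the LCC in the Results. The sketch is not the authors' analysis code.

```python
def agreement_metrics(tp, fp, fn, tn):
    """Agreement statistics for a 2x2 table, with EHR as the reference.

    tp: transfusion in both EHR and claims
    fp: transfusion in claims only
    fn: transfusion in EHR only
    tn: no transfusion in either source
    Assumes every row and column total is non-zero.
    """
    n = tp + fp + fn + tn
    observed = (tp + tn) / n  # percent agreement, as a proportion
    # Agreement expected by chance, from the marginal totals
    expected = ((tp + fn) * (tp + fp) + (tn + fp) * (tn + fn)) / n**2
    return {
        "agreement": observed,
        "kappa": (observed - expected) / (1 - expected),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }


# Hypothetical cell counts, back-calculated from the reported LCC
# percentages (n = 498); they are not stated in the abstract.
lcc = agreement_metrics(tp=154, fp=14, fn=165, tn=165)
for name, value in lcc.items():
    print(f"{name}: {value:.4f}")
```

With these assumed counts, the sketch reproduces the LCC figures reported in the Results to rounding (for example, sensitivity 48.28%, specificity 92.18%, kappa 0.34).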
Results
Of the 507 patients in the EHRC, 498 were found in KHM and included in the LCC, and 245 had closed claims and comprised the CCC. Demographics are listed in Table 1 and were similar between groups; however, the CCC had a higher percentage of patients with a more recent diagnosis. At least one transfusion was identified by the EHR vs claims data sources in 319 (64%) vs 168 (34%) patients in the LCC and in 143 (58%) vs 95 (39%) patients in the CCC (Table 2). In the LCC, the data sources agreed on whether a patient had a transfusion in 64.1% of cases, with a kappa of 0.34 (p = 0); in the CCC, agreement was 72.24%, with a kappa of 0.47 (p = 0). The unweighted kappa for the number of transfusions recorded during the observation period was 0.11 (p = 0) in the LCC and 0.18 (p = 0) in the CCC. Treating the EHR as the source of truth, claims data in the LCC had a sensitivity of 48.28%, specificity of 92.18%, PPV of 91.67%, and NPV of 50% for identifying transfusion events, while claims data in the CCC had a sensitivity of 59.44%, specificity of 90.20%, PPV of 89.47%, and NPV of 61.33%.

Conclusions
Agreement between transfusions identified via EHR abstraction and claims data sources was low, with the EHR identifying more transfusions, but was moderate when the analysis was restricted to patients with closed claims. As a source of transfusion data, the EHR provided more data points during the observation window for this group of patients. Future investigation of transfusions in patients with continuous closed claims during the observation window is warranted to determine whether it improves data capture. This study suggests that EHR abstraction may have higher utility for identifying transfusions, with lower attrition, in a cohort of MF patients with long observation periods and variable insurance coverage in the outpatient oncology setting.