1556 Background: Mammography-based deep learning (DL) risk stratification models predict future breast cancer with better discriminatory accuracy than traditional risk models; however, decisions about supplemental services such as MRI, and insurance payment for them, are still driven by traditional scores. The purpose of this study was to compare risk scores and cancer detection rates in patients identified as increased risk by DL vs traditional models in a large screening mammography cohort.
Methods: This multisite study included consecutive patients ≥40y undergoing routine bilateral screening mammography from 9/2017 to 1/2022 at five facilities. Tyrer-Cuzick version 8 (TC) and National Cancer Institute Breast Cancer Risk Assessment Tool (BCRAT) 5-year and lifetime models and a DL 5-year model were used to assess risk. Intermediate risk was defined as ≥1.67% for the TC and BCRAT 5-year models and ≥2.2 for the DL model; high risk was defined as ≥20% for the TC and BCRAT lifetime models and ≥6.0 for the DL model. Patients were included if all risk scores were available. Patient demographics were retrieved from electronic medical records, and cancer outcomes were obtained through regional tumor registry linkage. The proportions of increased-risk patients and the cancer detection rates (CDR; cancers per 1000 women screened) were compared across models using Pearson’s chi-square test.
Results: A total of 148476 exams in 69464 patients (mean age 58.0y [IQR: 50.0-67.0]) were performed. 81.7% (119386/146075) of exams were in White patients and 18.3% (26689/146075) in patients of races other than White. Among exams classified as intermediate risk, CDR was 7.1 (450/63109) by DL vs 4.7 (320/68545; P<0.001) by TC and 4.2 (300/70831; P<0.001) by BCRAT. Among exams classified as high risk, CDR was 20.6 (305/14772) by DL vs 3.9 (64/16442; P<0.001) by TC and 5.0 (29/5782; P<0.001) by BCRAT. Overall, 13.0% (19248/148476) of exams were classified as intermediate risk by DL but not by TC or BCRAT, with a CDR of 6.0 in this group, whereas 32.0% (47435/148476) were classified as intermediate risk by TC or BCRAT but not by DL, with a CDR of 1.1. Similarly, 8.5% (12577/148476) were classified as high risk by DL but not by TC or BCRAT, with a CDR of 20.8, whereas 11.1% (16451/148476) were classified as high risk by TC or BCRAT but not by DL, with a CDR of 1.9.
Conclusions: A significant proportion of patients identified as increased risk by the DL model are not assessed as increased risk by commonly used risk models, effectively excluding them from potentially life-saving supplemental services. Patients with elevated DL scores met the American College of Radiology's acceptable CDR on screening mammography (≥2.5) regardless of their traditional scores; however, patients with elevated traditional scores but without elevated DL scores did not meet this standard. Risk assessment guidelines incorporating DL models are necessary to capture the patients who benefit the most.
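The CDR arithmetic and the model comparison described above can be reproduced from the reported counts. The following is a minimal sketch, not the authors' analysis code: the function names `cdr_per_1000` and `pearson_chi2_2x2` are hypothetical, and the 2x2 layout (cancer vs no cancer, one row per model) is an assumption about how the Pearson chi-square test was applied. It also treats the two groups as independent, although the same exams can be flagged by both models.

```python
import math


def cdr_per_1000(cancers: int, exams: int) -> float:
    """Cancer detection rate: cancers per 1000 exams screened."""
    return 1000.0 * cancers / exams


def pearson_chi2_2x2(cancers_a: int, exams_a: int,
                     cancers_b: int, exams_b: int) -> tuple[float, float]:
    """Pearson chi-square (no continuity correction) for a 2x2 table.

    Rows are the two models; columns are cancer / no cancer.
    Returns (chi-square statistic, p-value for 1 degree of freedom).
    """
    observed = [
        [cancers_a, exams_a - cancers_a],
        [cancers_b, exams_b - cancers_b],
    ]
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    total = sum(row_totals)

    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (observed[i][j] - expected) ** 2 / expected

    # For 1 degree of freedom, the chi-square survival function
    # equals erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


if __name__ == "__main__":
    # Intermediate-risk counts as reported in the Results.
    dl = (450, 63109)
    tc = (320, 68545)
    print(f"DL CDR: {cdr_per_1000(*dl):.1f}")
    print(f"TC CDR: {cdr_per_1000(*tc):.1f}")
    chi2, p = pearson_chi2_2x2(*dl, *tc)
    print(f"chi2 = {chi2:.1f}, p = {p:.2e}")
```

Under these assumptions, the DL vs TC intermediate-risk comparison yields a p-value well below 0.001, in line with the reported result.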