Abstract
Despite the demonstrated potential of artificial intelligence (AI) in breast cancer risk assessment for personalizing screening recommendations, AI model bias and generalizability require further validation. We performed external validation in Black women of a mammography-driven AI breast cancer risk model (Mirai) originally developed on screening cohorts consisting primarily of White women.
In this institutional review board-approved, Health Insurance Portability and Accountability Act (HIPAA)-compliant study, conducted under a waiver of consent, we retrospectively analyzed a case–cohort sample nested within the core academic breast cancer screening practice of BJC Healthcare, the hospital partner of Washington University in St. Louis. Because this validation study relied on 2D digital mammography (DM) images, we focused on Black women presenting for annual DM screening (Selenia or Selenia Dimensions; Hologic) between 2008 and 2018. Eligible breast cancer cases were drawn from all women with a breast cancer diagnosis (with biopsy-confirmed tumor pathology obtained from the institutional cancer registry) who had a negative (BI-RADS 1 or 2) DM screening exam 1 to 5 years before the diagnosis. We also identified a random sample of controls, defined as women who had a negative (BI-RADS 1 or 2) DM screening exam followed by 1 to 5 years of screening follow-up without a cancer diagnosis. Risk scores for all DM exams were calculated with the Mirai model. Performance was evaluated using the concordance index (C-index) with associated 95% confidence intervals (CIs) for the entire cohort, as well as for study subgroups defined by invasive versus in-situ cancer and by cancer molecular subtype.
We analyzed 1368 DM screening exams, including 672 DM exams from 391 women diagnosed with breast cancer (mean age, 58 years; standard deviation, 10 years) and 696 DM exams from 406 controls (mean age, 55 years; standard deviation, 10 years).
The overall C-index was 0.62 [95% CI 0.60–0.64] for all Black women, which was lower than previously reported validation results for Mirai in studies of similar design on predominantly White and racially diverse screening cohorts (C-index = 0.67–0.78). There was no evidence of a significant difference in performance between invasive and in-situ cancer (C-index = 0.64 [95% CI 0.61–0.66] vs. 0.64 [95% CI 0.61–0.67]). Compared with other cancer molecular subtypes, performance was significantly higher for triple-negative cancer (C-index = 0.67 [95% CI 0.62–0.71]) and estrogen receptor (ER)-negative cancer (C-index = 0.66 [95% CI 0.61–0.70]).
A previously developed mammography-driven AI model showed overall good performance in long-term breast cancer risk assessment in a dataset of Black women only, particularly for triple-negative and ER-negative cancer types. However, performance was lower than previously reported validation results from similar studies on predominantly White and racially diverse screening cohorts. Our results suggest that further refinements are needed to achieve more accurate breast cancer risk assessment in Black women.
Table/figure caption: Long-term breast cancer risk prediction performance (C-indices) of Mirai in the full cohort of Black women and in study subgroups.
Citation Format: Juanita Hernandez Lopez, Zhixin Sun, Shirin Shoushtari, Ulugbek Kamilov, Debbie Bennett, Aimilia Gastounioti. Long-term Breast Cancer Risk Prediction in Black Women: External Validation of a Mammography-driven AI Model [abstract]. In: Proceedings of the 2023 San Antonio Breast Cancer Symposium; 2023 Dec 5-9; San Antonio, TX. Philadelphia (PA): AACR; Cancer Res 2024;84(9 Suppl):Abstract nr PO2-28-09.
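For readers unfamiliar with the metric, the concordance-index evaluation with 95% confidence intervals described in the abstract can be illustrated with a minimal sketch. This is not the authors' analysis code: the function names `harrell_c_index` and `bootstrap_ci`, the percentile bootstrap, and the synthetic data are all assumptions made for illustration. The sketch also ignores the case–cohort sampling weights and the fact that some women contribute several exams, both of which a full analysis would likely need to account for.

```python
import numpy as np


def harrell_c_index(time, event, risk):
    """Harrell's C: share of comparable pairs in which the subject with the
    earlier observed event has the higher risk score (ties in risk count 0.5).
    Hypothetical helper for illustration only."""
    time = np.asarray(time, dtype=float)
    event = np.asarray(event, dtype=bool)
    risk = np.asarray(risk, dtype=float)
    concordant = 0.0
    comparable = 0
    for i in np.flatnonzero(event):
        # A pair (i, j) is comparable when i has an event strictly before
        # j's observed (event or censoring) time.
        later = time > time[i]
        comparable += int(later.sum())
        concordant += float((risk[i] > risk[later]).sum())
        concordant += 0.5 * float((risk[i] == risk[later]).sum())
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


def bootstrap_ci(time, event, risk, n_boot=200, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the C-index (an assumed choice; the
    abstract does not state how its CIs were computed)."""
    rng = np.random.default_rng(seed)
    time, event, risk = (np.asarray(a) for a in (time, event, risk))
    n = len(time)
    stats = []
    for _ in range(n_boot):
        idx = rng.integers(0, n, size=n)  # resample subjects with replacement
        try:
            stats.append(harrell_c_index(time[idx], event[idx], risk[idx]))
        except ValueError:
            continue  # skip degenerate resamples with no comparable pairs
    lo, hi = np.quantile(stats, [alpha / 2, 1 - alpha / 2])
    return float(lo), float(hi)


# Synthetic data only: higher risk score -> shorter time to diagnosis,
# with administrative censoring at 5 years of follow-up.
rng = np.random.default_rng(42)
risk = rng.uniform(size=300)
years = rng.exponential(scale=1.0 / (0.1 + risk))
event = years <= 5.0
years = np.minimum(years, 5.0)

c = harrell_c_index(years, event, risk)
ci_low, ci_high = bootstrap_ci(years, event, risk)
print(f"C-index = {c:.2f} [95% CI {ci_low:.2f}-{ci_high:.2f}]")
```

A C-index of 0.5 corresponds to chance-level ranking and 1.0 to perfect ranking, which is the scale on which the reported values of 0.62 to 0.67 should be read.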