Use of novel artificial intelligence computer-assisted detection (AI-CAD) for screening mammography: an analysis of 17,884 consecutive two-view full-field digital mammography screening exams.

Sylvia H Heywang-Köbrunner, Alexander Jänsch, Astrid Hacker, Michael Hertlein, Alexander Katalinic, Susanne Elsner, Ruchira Sinnatamby, Christoph Mieskes

doi:10.1177/02841851231187382

Abstract

Novel artificial intelligence computer-assisted detection (AI-CAD) systems based on deep learning (DL) promise to support screen reading. The purpose of this study was to test a DL-AI-CAD system against human reading on consecutive screening mammograms. In this retrospective study, 17,884 consecutive anonymized screening mammograms, double-read from January to November 2018, were processed by the DL-AI-CAD system. AI-CAD reading was considered positive if the AI-CAD case score exceeded 30 (range = 1-100) and the lesion was correctly marked. Likewise, human reading (R1 or R2, respectively) was considered positive if the lesion was correctly identified and called. Receiver operating characteristic (ROC) analysis was performed and accuracy data were calculated. The ground truth for benign lesions was absence of malignancy after cancer registry matching (2022); for malignancy, it was histopathologic proof. Evaluation was patient-based. In total, 114 screen-detected cancers and 17 interval cancers (ICA) occurred. ROC analysis of screen-detected cancers yielded an area under the curve (AUC) of 89% for AI-CAD. Sensitivity/specificity was 81.7%/80.2% for AI-CAD, 77.1%/91.7% for R1, and 78.6%/91.6% for R2. Combining each human reading with AI-CAD was as sensitive as human double-reading (all approximately 88%) but less specific (approximately 75% versus approximately 87% for human double-reading). These AI-CAD combinations required consensus readings for twice as many cases as the human combination. Four of the 17 ICA exceeded a case score of 30; in two of these four, AI-CAD correctly marked the quadrant of the subsequent ICA. Including ICA cases, this AI-CAD achieved sensitivity comparable to human reading at lower specificity. Combining human reading with AI-CAD can increase sensitivity compared with single reading.
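The positivity rule and the accuracy measures described in the abstract can be illustrated with a minimal sketch. It is not the study's analysis code: the `Case` record, its field names, and the eight toy cases are hypothetical, and two points are assumptions rather than statements from the abstract, namely that a cancer-free case counts as AI-positive on score alone, and that a reader combined with AI-CAD is positive when either one is positive. Only the cut-off of 30 comes from the abstract.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

THRESHOLD = 30  # case-score cut-off given in the abstract (scores range 1-100)


@dataclass
class Case:
    """Hypothetical per-patient record; field names are illustrative."""
    has_cancer: bool        # ground truth
    ai_score: int           # AI-CAD case score
    ai_marked_lesion: bool  # AI-CAD mark lies on the true lesion (cancers only)
    reader_positive: bool   # reader identified and called the case


def ai_positive(case: Case) -> bool:
    # Positive only if the score exceeds the cut-off and, for cancers,
    # the lesion was correctly marked.
    if case.ai_score <= THRESHOLD:
        return False
    # Assumption: without a true lesion, the score alone decides.
    return case.ai_marked_lesion if case.has_cancer else True


def reader_positive(case: Case) -> bool:
    return case.reader_positive


def combined_positive(case: Case) -> bool:
    # Assumption: the combination is positive if either reading is positive.
    return ai_positive(case) or reader_positive(case)


def sensitivity_specificity(
    cases: List[Case], is_positive: Callable[[Case], bool]
) -> Tuple[float, float]:
    # Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP).
    tp = sum(1 for c in cases if c.has_cancer and is_positive(c))
    fn = sum(1 for c in cases if c.has_cancer and not is_positive(c))
    tn = sum(1 for c in cases if not c.has_cancer and not is_positive(c))
    fp = sum(1 for c in cases if not c.has_cancer and is_positive(c))
    return tp / (tp + fn), tn / (tn + fp)


# Toy data, invented for illustration only (not study data).
toy_cases = [
    Case(True, 80, True, True),
    Case(True, 45, False, True),    # score above cut-off but lesion not marked
    Case(True, 20, False, False),
    Case(True, 60, True, False),
    Case(False, 10, False, False),
    Case(False, 50, False, False),  # false-positive AI-CAD call
    Case(False, 5, False, True),    # false-positive reader call
    Case(False, 15, False, False),
]

ai_result = sensitivity_specificity(toy_cases, ai_positive)
reader_result = sensitivity_specificity(toy_cases, reader_positive)
combined_result = sensitivity_specificity(toy_cases, combined_positive)

print("AI-CAD   sens/spec:", ai_result)
print("Reader   sens/spec:", reader_result)
print("Combined sens/spec:", combined_result)
```

On these toy cases the either-positive combination raises sensitivity and lowers specificity relative to each single reading, which is the direction of the trade-off the abstract reports; the magnitudes carry no meaning.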

Full Text