To evaluate the clinical usefulness of a deep learning-based detection device for multiple abnormal findings on retinal fundus photographs for readers with varying expertise. Fourteen ophthalmologists (six residents, eight specialists) assessed 399 fundus images with respect to 12 major ophthalmologic findings, with or without the assistance of a deep learning algorithm, in two separate reading sessions. Sensitivity, specificity, and reading time per image were compared. With algorithmic assistance, readers significantly improved in sensitivity for all 12 findings (P < 0.05) but tended to be less specific (P < 0.05) for hemorrhage, drusen, membrane, and vascular abnormality, more profoundly so in residents. Sensitivity without algorithmic assistance was significantly lower in residents (23.1%∼75.8%) compared to specialists (55.1%∼97.1%) in nine findings, but it improved to similar levels with algorithmic assistance (67.8%∼99.4% in residents, 83.2%∼99.5% in specialists) with only hemorrhage remaining statistically significantly lower. Variances in sensitivity were significantly reduced for all findings. Reading time per image decreased in images with fewer than three findings per image, more profoundly in residents. When simulated based on images acquired from a health screening center, average reading time was estimated to be reduced by 25.9% (from 16.4 seconds to 12.1 seconds per image) for residents, and by 2.0% (from 9.6 seconds to 9.4 seconds) for specialists. Deep learning-based computer-assisted detection devices increase sensitivity, reduce inter-reader variance in sensitivity, and reduce reading time in less complicated images. This study evaluated the influence that algorithmic assistance in detecting abnormal findings on retinal fundus photographs has on clinicians, possibly predicting its influence on clinical application.
Read full abstract