Automated diabetic retinopathy (DR) screening using artificial intelligence has the potential to improve access to eye care by enabling large-scale screening. However, little is known about differences in real-world performance between available algorithms. This study compares the diagnostic accuracy of two AI screening platforms, IDx-DR and RetCAD, for detecting referable diabetic retinopathy (RDR). Retinal images from 758 patients with diabetes were collected during screening at various clinics in Poland. Each patient's images were graded by three human graders (320 patients by Polish graders and 438 by Indian graders), with the majority decision serving as the reference standard. The images were evaluated independently by the IDx-DR and RetCAD algorithms. Sensitivity, specificity, positive and negative predictive values, and agreement between the algorithms and human graders were calculated and statistically compared. IDx-DR demonstrated higher sensitivity (99.3%) but lower specificity (68.9%) for RDR detection compared with RetCAD, which had 89.4% sensitivity and 94.8% specificity. The positive predictive value was higher for RetCAD (96.4% vs 48.1% for IDx-DR), while the negative predictive value was higher for IDx-DR (99.5% vs 83.1% for RetCAD). Both algorithms achieved high sensitivity (> 95%) for detecting sight-threatening diabetic retinopathy. In this direct comparison on the same patient cohort, the two algorithms showed clear differences in their operating parameters for RDR screening: IDx-DR prioritized avoiding false negatives over false positives, while RetCAD maintained a more balanced trade-off. These results highlight the variable performance of current artificial intelligence screening solutions and underscore the importance of weighing algorithm performance metrics against available healthcare resources when deploying automated diabetic retinopathy screening programs.
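For context, the diagnostic metrics reported above follow the standard 2x2 confusion-matrix definitions. The sketch below is illustrative only; the function name and the counts in the example are hypothetical and do not reproduce the study's data or analysis code.

```python
# Illustrative sketch: standard 2x2 confusion-matrix metrics used in
# screening evaluation. Counts are hypothetical, not study data.

def screening_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Compute sensitivity, specificity, PPV, and NPV from a 2x2 table."""
    return {
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
        "ppv": tp / (tp + fp),          # positive predictive value
        "npv": tn / (tn + fn),          # negative predictive value
    }

# Hypothetical example with arbitrary counts
print(screening_metrics(tp=90, fp=20, fn=10, tn=180))
```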