Purpose: This study assessed the performance of an artificial intelligence (AI) model for glaucoma detection on the RIM One public database. Specifically, we examined how restricting the analysis to the model's high-confidence predictions affects overall diagnostic accuracy.

Methods: The RIM One database provided 485 images: 313 normal and 172 glaucomatous. Of these, 248 were used to train the EfficientNetV2B0 model, 63 for validation, and 174 for testing. The model was first evaluated on the entire test set, yielding an area under the receiver operating characteristic curve (AUC) of 96%. We then evaluated it only on predictions with a certainty probability above 95%, which reduced the test set to 155 images.

Results: The AI model achieved an AUC of 96% when all predictions were considered. Restricting the analysis to high-confidence predictions (>95% certainty) increased the AUC to 100% and reduced the test set from 174 to 155 images. The DeLong test showed a significant difference between the two AUC values, highlighting the efficacy of high-confidence predictions.

Conclusion: Our findings demonstrate that considering only AI predictions with a certainty probability exceeding 95% can significantly enhance the diagnostic accuracy of glaucoma detection. This approach yielded a perfect AUC on the retained images, indicating robust performance in identifying glaucomatous cases. The reduction in test set size suggests that the 19 excluded cases may require further evaluation, for example by visual field testing or optical coherence tomography (OCT), to confirm or exclude the diagnosis of glaucoma. This selective approach not only improves the efficiency of glaucoma diagnosis but also streamlines clinical decision-making by directing resources to cases with the highest predictive certainty.
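The confidence-filtering step described in the Methods can be illustrated with a minimal Python sketch. It is not the authors' code: the function names `auc` and `select_confident` are hypothetical, the toy labels and probabilities are invented for illustration, and it assumes that "certainty" means the larger of the predicted glaucoma probability and its complement, which the abstract does not specify.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def select_confident(probs, threshold=0.95):
    """Return (kept, deferred) index lists for a given certainty threshold.

    Assumption: certainty = max(p, 1 - p), where p is the predicted
    probability of glaucoma. The abstract does not define it.
    """
    kept, deferred = [], []
    for i, p in enumerate(probs):
        certainty = max(p, 1.0 - p)
        (kept if certainty > threshold else deferred).append(i)
    return kept, deferred


# Invented toy data (1 = glaucoma, 0 = normal); not taken from the study.
labels = [1, 1, 1, 0, 0, 0, 1, 0]
probs = [0.99, 0.97, 0.98, 0.01, 0.02, 0.03, 0.40, 0.60]

kept, deferred = select_confident(probs)
auc_all = auc(labels, probs)
auc_kept = auc([labels[i] for i in kept], [probs[i] for i in kept])

print(f"AUC, all cases:        {auc_all:.4f}")
print(f"AUC, confident cases:  {auc_kept:.4f}")
print(f"Deferred for follow-up: {len(deferred)} of {len(labels)}")
```

In this toy example, the two uncertain cases are set aside in `deferred`, mirroring the 19 test images that the abstract suggests referring for visual field testing or OCT.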