Diabetic retinopathy (DR) is a significant cause of vision impairment globally, which makes timely and precise detection essential for preventing severe consequences. This study presents an optimized Vision Transformer (ViT) model that incorporates Harris Hawk Optimization (HHO) to improve automated DR detection. The ViT architecture uses self-attention to capture both local and global features in retinal images, while HHO tunes key hyperparameters to maximize model performance. On the APTOS-2019 dataset, the proposed ViT-HHO model achieved 99.83 % accuracy, 99.78 % sensitivity, 99.85 % specificity, and 99.80 % AUC-ROC, surpassing traditional CNNs and alternative optimization techniques. The model also generalized well to the IDRiD dataset, achieving an accuracy of 99.11 % and an AUC-ROC of 99.12 %. These results indicate that the ViT-HHO model has the potential to enhance clinical detection of DR with high precision and reliability.
• An optimized Vision Transformer (ViT) model was developed using HHO for improved detection of diabetic retinopathy (DR).
• The model was validated on the APTOS-2019 and IDRiD datasets, demonstrating superior accuracy and AUC-ROC metrics.
• The model's generalization and robustness were demonstrated through comprehensive performance evaluations.
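For readers less familiar with the reported metrics, the following is a minimal sketch of how accuracy, sensitivity, and specificity are conventionally computed from predictions. It assumes a binary labelling (1 = DR present, 0 = no DR), which the abstract does not specify; the names `BinaryCounts` and `count_outcomes` and the toy labels are illustrative and are not taken from the study.

```python
from dataclasses import dataclass


@dataclass
class BinaryCounts:
    """Confusion-matrix counts for a binary task (assumed: 1 = DR, 0 = no DR)."""
    tp: int  # DR images predicted as DR
    fp: int  # healthy images predicted as DR
    tn: int  # healthy images predicted as healthy
    fn: int  # DR images predicted as healthy


def count_outcomes(y_true, y_pred):
    """Tally the four confusion-matrix cells from paired label lists."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1
        elif truth == 0 and pred == 1:
            fp += 1
        elif truth == 0 and pred == 0:
            tn += 1
        else:
            fn += 1
    return BinaryCounts(tp, fp, tn, fn)


def accuracy(c):
    """Fraction of all images classified correctly."""
    return (c.tp + c.tn) / (c.tp + c.fp + c.tn + c.fn)


def sensitivity(c):
    """Fraction of DR images that were detected (true positive rate)."""
    return c.tp / (c.tp + c.fn)


def specificity(c):
    """Fraction of healthy images correctly cleared (true negative rate)."""
    return c.tn / (c.tn + c.fp)


# Hypothetical toy labels, purely for illustration (not data from the study).
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1, 0, 1]
counts = count_outcomes(y_true, y_pred)
print(accuracy(counts), sensitivity(counts), specificity(counts))
```

Reporting sensitivity and specificity alongside accuracy matters in screening settings, because a dataset dominated by healthy images can yield high accuracy even when many DR cases are missed.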