Background and Objective: Diabetic retinopathy (DR) classification aims to use the information implicit in retinal images for early diagnosis, so that further worsening of the condition can be prevented or mitigated. However, existing methods typically show significant advantages only when trained on large annotated datasets. They also require the number of samples per category to be roughly balanced, because an imbalanced sample distribution leads a model to focus excessively on high-frequency disease categories while neglecting less common but equally important ones. There is therefore an urgent need for a classification method that effectively alleviates sample distribution imbalance and thereby improves the accuracy of DR classification.
Methods: In this work, we propose MediDRNet, a dual-branch network model based on prototypical contrastive learning. The model creates a prototype for each lesion level, ensuring that each prototype represents the core features of that level, and classifies an image by comparing the similarity between its representation and the category prototypes. The dual-branch structure effectively addresses category imbalance and improves classification accuracy by emphasizing subtle differences in retinal lesions. In addition, the model incorporates the convolutional block attention module (CBAM) to enhance the identification of lesion features.
Results: Experiments on the Kaggle and UWF DR classification datasets show that MediDRNet performs exceptionally well compared with other advanced models, particularly on the UWF DR classification dataset, where it achieved state-of-the-art performance across all metrics.
On the Kaggle DR classification dataset, it achieved the highest average classification accuracy (0.6327) and Macro-F1 score (0.6361). For the minority categories on the Kaggle dataset (Grades 1, 2, 3, and 4), the model reached classification accuracies of 58.08%, 55.32%, 69.73%, and 90.21%, respectively. In the ablation study, MediDRNet proved more effective at extracting features from diabetic retinal fundus images than other feature extraction methods.
Conclusions: This study employed prototypical contrastive learning and bidirectional branch learning strategies to construct a grading system for DR lesions on imbalanced DR datasets. Within the dual-branch network, the feature learning branch facilitated a smooth transition of features from the grading network to the classification learning branch, allowing minority sample categories to be identified accurately. The method effectively alleviated the sample imbalance problem and provides strong support for the precise grading and early diagnosis of DR in clinical applications, performing exceptionally well on complex DR datasets. Moreover, this research improves the efficiency of preventing and managing disease progression for DR patients in medical practice. We encourage the use and modification of our code, which is publicly accessible on GitHub: https://github.com/ReinforceLove/MediDRNet.
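The Methods describe classification by comparing a sample with one prototype per lesion level. The following is a minimal sketch of that general idea only, not the MediDRNet implementation: it assumes that prototypes are the normalized mean of each class's embeddings and that similarity is cosine similarity, neither of which is specified in the abstract. The function names (`l2_normalize`, `class_prototypes`, `classify`) and the toy two-dimensional embeddings are hypothetical.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (returned unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def class_prototypes(embeddings, labels):
    """Assumed prototype definition: normalized mean of each class's unit embeddings."""
    sums, counts = {}, {}
    for emb, label in zip(embeddings, labels):
        unit = l2_normalize(emb)
        if label not in sums:
            sums[label] = [0.0] * len(unit)
            counts[label] = 0
        sums[label] = [s + u for s, u in zip(sums[label], unit)]
        counts[label] += 1
    return {
        label: l2_normalize([s / counts[label] for s in sums[label]])
        for label in sums
    }


def classify(embedding, prototypes):
    """Return the label whose prototype has the highest cosine similarity."""
    unit = l2_normalize(embedding)

    def cosine(prototype):
        return sum(a * b for a, b in zip(unit, prototype))

    return max(prototypes, key=lambda label: cosine(prototypes[label]))


# Toy, made-up embeddings: three samples of grade 0, one sample of grade 4.
train_embeddings = [[1.0, 0.1], [0.9, 0.2], [0.8, 0.0], [0.1, 1.0]]
train_labels = [0, 0, 0, 4]

prototypes = class_prototypes(train_embeddings, train_labels)
predicted_grade = classify([0.2, 0.9], prototypes)
```

In this sketch each grade contributes exactly one prototype regardless of how many training samples it has, so the single grade-4 sample is compared on equal footing with the three grade-0 samples at prediction time.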
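The Results report a Macro-F1 score alongside per-grade accuracies. As an illustration of why Macro-F1 suits imbalanced data, the sketch below computes per-class recall and Macro-F1 on made-up labels. It assumes the standard definition of Macro-F1 (the unweighted mean of per-class F1 scores) and treats per-class accuracy as per-class recall, which the abstract does not confirm. The function names and toy labels are hypothetical and are not taken from the paper or its repository.

```python
def per_class_metrics(y_true, y_pred, classes):
    """Compute recall and F1 for each class from paired label lists."""
    metrics = {}
    for c in classes:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == c and p == c)
        fp = sum(1 for t, p in pairs if t != c and p == c)
        fn = sum(1 for t, p in pairs if t == c and p != c)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (
            2 * precision * recall / (precision + recall)
            if precision + recall
            else 0.0
        )
        metrics[c] = {"recall": recall, "f1": f1}
    return metrics


def macro_f1(metrics):
    """Unweighted mean of per-class F1, so every class counts equally."""
    return sum(m["f1"] for m in metrics.values()) / len(metrics)


# Toy, made-up labels: class 0 is frequent, classes 1 and 2 are minorities.
y_true = [0, 0, 0, 0, 0, 0, 1, 1, 2, 2]
y_pred = [0, 0, 0, 0, 0, 0, 1, 0, 2, 1]

metrics = per_class_metrics(y_true, y_pred, classes=[0, 1, 2])
overall_accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
macro = macro_f1(metrics)
```

On these toy labels the overall accuracy is dominated by the frequent class, while the Macro-F1 score is pulled down by the errors on the two minority classes.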