Diabetic retinopathy (DR) diagnosis methods in the literature are often criticized for their limited ability to diagnose DR-related features or for lacking interpretability. To address these issues, this paper investigates the feasibility of diagnosing both DR severity levels and the presence of DR-related features in a two-step procedure. Specifically, this paper first analyzes the quality of annotations in DR grading by measuring inter-grader variability. Cosine similarity is used to evaluate the inter-grader variability of the presence of DR-related features, and quadratic weighted Cohen's kappa is used to assess the inter-grader variability of DR severity levels. Next, the following annotation methods are compared in terms of DR severity prediction performance using logistic regression: 1) single annotations by a single grader (SASG); 2) single annotations from multiple graders (SAMG); 3) multiple annotations by voting (MAV); and 4) double annotations with adjudication of disagreement (DAAD). Based on the comparison results, the feasibility of diagnosing both DR severity and features is investigated. The experiments use 1589 fundus images graded by three retinal specialists and four general ophthalmologists. The results demonstrate that retinal specialists are more consistent than general ophthalmologists in grading both the presence of DR-related features and DR severity. SASG and MAV should be avoided where possible, whereas DAAD is a good option when prediction performance is the highest priority, and SAMG is especially beneficial when both prediction performance and grading costs are considered. The upper-limit performance of DR severity prediction reaches an accuracy of 95.6% and a kappa of 0.962. When DR-related feature prediction achieves an average cosine similarity of 0.823, DR severity prediction can potentially reach an accuracy of 91.2% and a kappa of 0.905 in real applications.
Together, these results suggest the potential of diagnosing both DR severity and the presence of DR-related features in a two-step procedure.
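
The two agreement measures named above can be illustrated with a minimal sketch. It is not the paper's implementation: the function names, the encoding of feature presence as a binary vector, and the five-level severity scale (0 to 4) are assumptions made here for illustration only.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature-presence vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        # Assumed convention: an all-zero vector has similarity 0.
        return 0.0
    return dot / (norm_a * norm_b)


def quadratic_weighted_kappa(grades_a, grades_b, n_levels):
    """Cohen's kappa with quadratic weights for integer grades 0..n_levels-1."""
    n = len(grades_a)
    # Observed confusion matrix between the two graders.
    observed = [[0] * n_levels for _ in range(n_levels)]
    for i, j in zip(grades_a, grades_b):
        observed[i][j] += 1
    # Marginal grade counts for each grader.
    hist_a = [sum(row) for row in observed]
    hist_b = [sum(observed[i][j] for i in range(n_levels)) for j in range(n_levels)]
    weighted_observed = 0.0
    weighted_expected = 0.0
    for i in range(n_levels):
        for j in range(n_levels):
            weight = (i - j) ** 2 / (n_levels - 1) ** 2
            weighted_observed += weight * observed[i][j]
            # Expected count under independence of the two graders.
            weighted_expected += weight * hist_a[i] * hist_b[j] / n
    if weighted_expected == 0:
        # Both graders used one identical grade throughout (assumed convention).
        return 1.0
    return 1.0 - weighted_observed / weighted_expected


# Hypothetical data: presence (1) or absence (0) of four DR-related features.
grader_1_features = [1, 0, 1, 1]
grader_2_features = [1, 0, 0, 1]
feature_agreement = cosine_similarity(grader_1_features, grader_2_features)

# Hypothetical severity grades on an assumed five-level scale.
grader_1_severity = [0, 1, 2, 3, 4, 2]
grader_2_severity = [0, 1, 2, 4, 4, 1]
severity_agreement = quadratic_weighted_kappa(grader_1_severity, grader_2_severity, 5)
```

Quadratic weights penalize a disagreement by the squared distance between the two grades, so a two-level disagreement costs four times as much as a one-level disagreement.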