Abstract

Purpose: Heatmapping techniques can support explainability of deep learning (DL) predictions in medical image analysis. However, individual techniques have mainly been applied descriptively, without objective and systematic evaluation. We investigated their comparative performance using diabetic retinopathy lesion detection as a benchmark task.

Methods: The publicly available Indian Diabetic Retinopathy Image Dataset (IDRiD) contains fundus images of diabetes patients with pixel-level annotations of diabetic retinopathy (DR) lesions, which served as the ground truth for this study. Three pretrained DL models (ResNet50, VGG16, and InceptionV3) were used for DR detection in these images. Explainability was then visualized with each of the 10 most widely used heatmapping techniques. The quantitative correspondence between the output of a heatmap and the ground truth was evaluated with the Explainability Consistency Score (ECS), a metric between 0 and 1 developed for this comparative task.

Results: For overall DR lesion detection, the ECS ranged from 0.21 to 0.51 across all model/heatmapping combinations. The highest score was achieved by VGG16+Grad-CAM (ECS = 0.51; 95% confidence interval [CI]: [0.46; 0.55]). For individual lesion types, VGG16+Grad-CAM performed best on hemorrhages and hard exudates, ResNet50+SmoothGrad performed best on soft exudates, and ResNet50+Guided Backpropagation performed best on microaneurysms.

Conclusions: Our empirical evaluation on the IDRiD database demonstrated that the combination of DL model and heatmapping technique affects explainability for common DR lesions. Our approach found considerable disagreement between the regions highlighted by heatmaps and expert annotations.

Translational Relevance: We call for a more systematic investigation and analysis of heatmaps for reliable explanation of image-based predictions of DL models.

Highlights

  • With deep learning (DL), we can achieve excellent diagnostic performance on a wide range of medical image analysis tasks.[1,2,3] However, the use of DL in clinical decision-making is still challenging and involves demonstrating the clinical utility of the algorithm, obtaining regulatory approval, and building trust and acceptance among medical practitioners and patients

  • Our empirical evaluation on the Indian Diabetic Retinopathy Image Dataset (IDRiD) database demonstrated that the combination of DL model and heatmapping technique affects explainability for common diabetic retinopathy (DR) lesions

  • Explainability Consistency Score (ECS) values between 0.21 and 0.51 illustrate that the choice of architecture and heatmapping technique affects how much visual overlap there is with the expert-segmented DR lesions


Introduction

With deep learning (DL), we can achieve excellent diagnostic performance on a wide range of medical image analysis tasks.[1,2,3] However, the use of DL in clinical decision-making is still challenging and involves demonstrating the clinical utility of the algorithm, obtaining regulatory approval, and building trust and acceptance among medical practitioners and patients. In this context, one of the tasks is to explain how and why a DL algorithm, often conceived as a black-box model, makes a particular prediction.[4] A wide range of heatmapping techniques has been introduced to produce visual maps that highlight the regions of an image contributing most to the prediction, thereby explaining the algorithm's decision.[5,6,7] Heatmaps build trust in the model when they corroborate clinically
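For readers unfamiliar with how such heatmaps are computed, the sketch below illustrates the core of the standard Grad-CAM technique (one of the methods compared in the study) in plain numpy: each feature map of a convolutional layer is weighted by the global average of the prediction's gradient with respect to it, the weighted maps are summed, and a ReLU keeps only positively contributing regions. The function name and the synthetic inputs are illustrative; a real pipeline would obtain activations and gradients from the trained network itself.

```python
import numpy as np

def grad_cam(feature_maps, gradients):
    """Core Grad-CAM computation on precomputed tensors.

    feature_maps: (K, H, W) activations of a conv layer for one image
    gradients:    (K, H, W) gradient of the class score w.r.t. those
                  activations (would come from backprop in practice)

    Returns an (H, W) heatmap normalized to [0, 1].
    """
    # alpha_k: global-average-pooled gradient per feature map.
    weights = gradients.mean(axis=(1, 2))
    # Weighted sum over the K feature maps: sum_k alpha_k * A_k.
    cam = np.tensordot(weights, feature_maps, axes=1)
    # ReLU: keep only regions with a positive influence on the score.
    cam = np.maximum(cam, 0.0)
    if cam.max() > 0:
        cam /= cam.max()
    return cam
```

The resulting low-resolution map is typically upsampled to the input image size and overlaid on the fundus photograph, which is the visualization that the study compares against the pixel-level lesion annotations.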

