Abstract

Deep learning offers significant advancements in the accuracy of prostate identification and classification, underscoring its potential for clinical integration. However, the opacity of deep learning models poses interpretability challenges that must be addressed for these models to gain acceptance and utility in medical diagnosis and detection. While explanation methods have been proposed to demystify these models and enhance their clinical viability, the efficacy and acceptance of such methods in medical tasks are not well documented. This pilot study investigates the effectiveness of deep learning explanation methods in clinical settings and identifies the attributes that radiologists consider crucial for explainability, with the aim of directing future enhancements. The study reveals that while explanation methods can improve clinical task performance by up to 20%, their perceived usefulness varies, with some methods rated poorly. Radiologists prefer explanation methods that are robust against noise, precise, and consistent. These preferences underscore the need to refine explanation methods to align with clinical expectations, emphasizing clarity, accuracy, and reliability. The findings highlight the importance of developing explanation methods that not only improve performance but are also tailored to the stringent requirements of clinical practice, thereby fostering deeper trust in and broader acceptance of deep learning in medical diagnostics.
