Abstract

Evaluating the trustworthiness of deep learning-based computer-aided diagnosis (CAD) systems is challenging, and model selection must balance trust against performance. When many candidate models are scored on several evaluation metrics, choosing the best one becomes a complex multi-criteria decision-making problem. In COVID-19 diagnosis, establishing trust requires combining physicians’ evaluation with AI techniques. In this study, 1551 chest X-rays were analyzed using deep transfer learning (DTL) with six models and four SVM kernels, yielding 24 hybrid DTL–SVM models. The models were evaluated on seven metrics using fuzzy multiple-criteria decision-making (MCDM), which uses a dynamic decision matrix to select the best model. The approach incorporates fuzzy-weighted zero-inconsistency (FWZIC) to derive the weight coefficients of the criteria and VIseKriterijumska Optimizacija I Kompromisno Resenje (VIKOR) to benchmark the models. Grad-CAM was used to examine the best model on 16 images, providing explainability. The top-performing models identified include SqueezeNet-SVM linear, VGG19-SVM linear, and VGG19-SVM. Sensitivity analysis quantified the impact of changing the weighted criteria values. A physician expert validated the fuzzy MCDM results through Grad-CAM analysis, which is a novel aspect of this study. The proposed framework was benchmarked against seven other studies and achieved a perfect score in four crucial areas. Trustworthiness is essential for CAD systems, and this study addresses the trust and performance challenges of AI model-based CAD systems. The study systematically evaluated the requirements for trustworthiness, including accountability, fairness, robustness, accuracy, and reproducibility, and the results supported this assessment.
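
As an illustration of the benchmarking step, the following is a minimal sketch of the standard VIKOR ranking procedure. It is not the study's implementation. The decision matrix, the weights, and the function name `vikor` are hypothetical; in the study the weights come from FWZIC, whereas here they are fixed by assumption. The trade-off parameter `v = 0.5` is the conventional default, also an assumption.

```python
def vikor(matrix, weights, benefit, v=0.5):
    """Standard VIKOR scores for a crisp decision matrix (illustrative only).

    matrix  : one row per alternative (model), one column per criterion
    weights : criterion weights, assumed to sum to 1
    benefit : per criterion, True if higher is better, False if lower is better
    v       : weight of group utility versus individual regret (assumed 0.5)
    Returns (S, R, Q); a lower Q means a better-ranked alternative.
    """
    cols = list(zip(*matrix))
    best = [max(c) if b else min(c) for c, b in zip(cols, benefit)]
    worst = [min(c) if b else max(c) for c, b in zip(cols, benefit)]

    S, R = [], []
    for row in matrix:
        terms = []
        for x, w, fb, fw in zip(row, weights, best, worst):
            span = fb - fw
            # Weighted, normalized distance from the best value on this criterion
            terms.append(0.0 if span == 0 else w * (fb - x) / span)
        S.append(sum(terms))  # group utility
        R.append(max(terms))  # individual regret

    def scale(vals):
        lo, hi = min(vals), max(vals)
        return [0.0 if hi == lo else (x - lo) / (hi - lo) for x in vals]

    Q = [v * s + (1 - v) * r for s, r in zip(scale(S), scale(R))]
    return S, R, Q


# Hypothetical scores for three models on two benefit criteria
scores = [[0.95, 0.90], [0.90, 0.85], [0.80, 0.70]]
weights = [0.6, 0.4]            # assumed; FWZIC would supply these in the study
benefit = [True, True]

S, R, Q = vikor(scores, weights, benefit)
ranking = sorted(range(len(Q)), key=lambda i: Q[i])  # best model first
```

Here `ranking` lists the model indices from best to worst by ascending `Q`. A full VIKOR analysis also checks the acceptable-advantage and acceptable-stability conditions before declaring a single compromise solution, which this sketch omits.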
