Abstract

Skin lesion diagnosis is a key step in skin cancer screening and demands both high accuracy and interpretability. Although many computer-aided methods, especially deep learning methods, have achieved remarkable results in skin lesion diagnosis, their generalization and interpretability remain a challenge. To address this, we propose an interpretability-based multimodal convolutional neural network (IM-CNN), a multiclass classification model that takes skin lesion images and patient metadata as input. IM-CNN consists of three main paths that process patient metadata, features extracted from the segmented skin lesion using domain knowledge, and the skin lesion image, respectively. We add interpretable visual modules that provide explanations for both the images and the metadata. In addition to the area under the ROC curve (AUC), sensitivity, and specificity, we introduce a new indicator for performance evaluation: the area under the ROC curve restricted to sensitivity above 80% (AUC_SEN_80). Extensive experiments on the popular HAM10000 dataset show that the proposed model substantially outperforms popular deep learning models such as DenseNet and ResNet, as well as other state-of-the-art models for melanoma diagnosis. The multimodal model also achieves average improvements of 72% in sensitivity and 21% in AUC_SEN_80 over the single-modal model. The visual explanations can help gain dermatologists' trust and enable human-machine collaboration, effectively reducing the limitations of black-box models in supporting medical decision making.
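
The three-path fusion design and the AUC_SEN_80 indicator can be made concrete with two short sketches. First, a minimal three-path multimodal classifier in PyTorch: the branch widths, the tiny image backbone, and the late-fusion head are illustrative assumptions, since the abstract does not specify the exact IM-CNN architecture, only that the three paths handle patient metadata, domain-knowledge features from the segmented lesion, and the lesion image. HAM10000 has seven diagnostic classes, hence n_classes=7.

```python
import torch
import torch.nn as nn

class ThreePathFusion(nn.Module):
    """Sketch of a three-path multimodal classifier in the spirit of IM-CNN.
    Branch sizes and the fusion scheme are illustrative assumptions."""

    def __init__(self, n_meta=10, n_domain=32, n_classes=7):
        super().__init__()
        # Path 1: patient metadata (e.g., age, sex, lesion site).
        self.meta = nn.Sequential(nn.Linear(n_meta, 64), nn.ReLU())
        # Path 2: hand-crafted features computed from the segmented lesion
        # with domain knowledge (e.g., shape and colour descriptors).
        self.domain = nn.Sequential(nn.Linear(n_domain, 64), nn.ReLU())
        # Path 3: CNN over the raw lesion image (a tiny stand-in backbone).
        self.image = nn.Sequential(
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(16, 64), nn.ReLU(),
        )
        # Late fusion: concatenate the three embeddings, then classify.
        self.head = nn.Linear(64 * 3, n_classes)

    def forward(self, image, meta, domain_feats):
        z = torch.cat(
            [self.image(image), self.meta(meta), self.domain(domain_feats)],
            dim=1,
        )
        return self.head(z)
```

For example, `ThreePathFusion()(torch.randn(2, 3, 224, 224), torch.randn(2, 10), torch.randn(2, 32))` returns logits of shape (2, 7).

Second, one plausible reading of AUC_SEN_80 is the area under the ROC curve over the region where sensitivity (TPR) is at least 0.80. scikit-learn's roc_auc_score only supports a max_fpr restriction, so the sketch below integrates the restricted segment manually; the exact normalization the authors use is not given in the abstract and is an assumption here.

```python
import numpy as np
from sklearn.metrics import roc_curve

def auc_sen_80(y_true, y_score, min_tpr=0.80):
    # Full ROC curve; fpr and tpr are non-decreasing arrays.
    fpr, tpr, _ = roc_curve(y_true, y_score)
    # FPR at which the curve first reaches min_tpr (linear interpolation).
    fpr_at_min = np.interp(min_tpr, tpr, fpr)
    # Keep only the high-sensitivity portion of the curve.
    keep = tpr >= min_tpr
    fpr_part = np.concatenate(([fpr_at_min], fpr[keep]))
    tpr_part = np.concatenate(([min_tpr], tpr[keep]))
    # Trapezoidal area under the restricted segment (1.0 for a perfect model).
    return np.trapz(tpr_part, fpr_part)
```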
