Abstract

Explaining deep models is desirable, especially for improving user trust and experience. Much progress has been made recently toward visually and semantically explaining deep models. However, establishing the most effective explanation is often human-dependent and therefore suffers from annotator bias. To address this issue, we propose a multitask learning network (MTL-Net) that generates saliency-based visual explanations as well as attribute-based semantic explanations. Via an integrated evaluation mechanism, our model quantitatively evaluates the quality of the generated explanations. First, we introduce attributes into the image classification process, rank each attribute's contribution with gradient-weighted mapping, and generate semantic explanations from those attributes. Second, we propose a fusion classification mechanism (FCM) to evaluate three recent saliency-based visual explanation methods by their influence on classification. Third, we conduct user studies as well as quantitative and qualitative evaluations. According to our results on three benchmark datasets of varying size and granularity, our attribute-based semantic explanations are not only helpful to users but also improve the classification accuracy of the model, and our ranking framework identifies the best-performing visual explanation method in agreement with the users.
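
The sketch below illustrates one plausible reading of the gradient-weighted attribute ranking step described above; it is not the paper's exact formulation. The model interface (`model` returning class logits plus per-attribute activations), the `attribute_names` list, and the gradient-times-activation weighting are all assumptions made for illustration.

```python
import torch

def rank_attributes(model, image, attribute_names):
    """Rank attribute contributions to the predicted class via a
    gradient-weighted mapping (hypothetical sketch, PyTorch assumed)."""
    model.eval()
    image = image.unsqueeze(0)  # add batch dimension: (1, C, H, W)

    # Assumed model interface: class logits and per-attribute activations.
    logits, attr_activations = model(image)  # attr_activations: (1, num_attributes)
    attr_activations.retain_grad()           # keep gradients at this intermediate layer

    # Back-propagate the predicted class score to the attribute layer.
    pred_class = logits.argmax(dim=1).item()
    logits[0, pred_class].backward()

    # One plausible weighting: gradient times activation per attribute.
    contributions = (attr_activations.grad * attr_activations).squeeze(0)

    # Sort attributes by their contribution, highest first.
    ranking = torch.argsort(contributions, descending=True)
    return [(attribute_names[i], contributions[i].item()) for i in ranking.tolist()]
```

The ranked list could then be verbalized into an attribute-based semantic explanation (e.g. naming the top-k attributes for the predicted class), in line with the procedure the abstract outlines.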
