Understanding the decisions of CNNs: An in-model approach

Isabel Rio-Torto, Kelwin Fernandes, Luís F Teixeira

doi:10.1016/j.patrec.2020.04.004

Abstract

With the outstanding predictive performance of Convolutional Neural Networks on different tasks and their widespread use in real-world scenarios, it is essential to understand and trust these black-box models. While most of the literature focuses on post-model methods, we propose a novel in-model joint architecture, composed of an explainer and a classifier. This architecture outputs not only a class label, but also a visual explanation of that decision, without requiring any labelled data beyond the image class to train the explainer. The model is trained end-to-end, with the classifier taking as input an image and the explainer’s resulting explanation, thus allowing the classifier to focus on the relevant areas of that explanation. Moreover, this approach can be employed with any classifier, provided that the necessary connections to the explainer are made. We also propose a three-phase training process and two alternative custom loss functions that regularise the produced explanations and encourage desired properties, such as sparsity and spatial contiguity. The architecture was validated on two datasets (a subset of ImageNet and a cervical cancer dataset), and the results show that it is able to produce meaningful image- and class-dependent visual explanations, without direct supervision, aligned with intuitive visual features associated with the data. Quantitative assessment of explanation quality was conducted through iterative perturbation of the input image according to the explanation heatmaps. The impact on classification performance is studied in terms of average function value and AOPC (Area Over the MoRF (Most Relevant First) Curve). For further evaluation, we propose POMPOM (Percentage of Meaningful Pixels Outside the Mask) as another measurable criterion of explanation quality. These analyses showed that the proposed method outperformed state-of-the-art post-model methods, such as LRP (Layer-wise Relevance Propagation).
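The perturbation-based evaluation mentioned above can be illustrated with a minimal sketch. It follows the commonly used formulation of AOPC (the mean drop in the class score as the most relevant regions are perturbed first) and is not necessarily the exact protocol used in this work. The function names `morf_scores` and `aopc`, the `predict` callable, the patch size, the number of steps, and the replacement of regions by uniform noise in [0, 1) are all assumptions made for illustration.

```python
import numpy as np

def morf_scores(image, heatmap, predict, steps=10, patch=4, rng=None):
    """Return class scores f(x_0), ..., f(x_steps) under MoRF perturbation.

    image:   array of shape (H, W) or (H, W, C), values assumed in [0, 1]
    heatmap: array of shape (H, W), higher means more relevant
    predict: hypothetical callable mapping an image to a scalar class score
    """
    rng = np.random.default_rng(0) if rng is None else rng
    x = image.astype(float).copy()
    h, w = heatmap.shape
    gh, gw = h // patch, w // patch
    # Sum relevance over non-overlapping patch x patch regions.
    region = (heatmap[:gh * patch, :gw * patch]
              .reshape(gh, patch, gw, patch)
              .sum(axis=(1, 3)))
    # Region indices sorted from most to least relevant.
    order = np.argsort(region, axis=None)[::-1]
    scores = [predict(x)]
    for idx in order[:steps]:
        r, c = divmod(int(idx), gw)
        sl = (slice(r * patch, (r + 1) * patch),
              slice(c * patch, (c + 1) * patch))
        # Assumption: perturb a region by replacing it with uniform noise.
        x[sl] = rng.uniform(0.0, 1.0, size=x[sl].shape)
        scores.append(predict(x))
    return np.array(scores)

def aopc(scores):
    """AOPC for one image: mean drop of the score relative to the unperturbed input."""
    return float(np.mean(scores[0] - scores))
```

Under this formulation, a larger AOPC means the score falls faster when the regions ranked most relevant are removed, which is read as evidence that the heatmap highlights regions the classifier actually relies on. In practice the per-image values would be averaged over the evaluation set.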

Full Text