Abstract

Attribution-based explanations are popular in computer vision but of limited use for the fine-grained classification problems typical of expert domains, where classes differ by subtle details. In these domains, users also seek an understanding of "why" a class was chosen and "why not" an alternative class. A new GenerAlized expLanatiOn fRamEwork (GALORE) is proposed to satisfy all these requirements, by unifying attributive explanations with explanations of two other types. The first is a new class of explanations, denoted deliberative, proposed to address the "why" question by exposing the network's insecurities about a prediction. The second is the class of counterfactual explanations, which have been shown to address the "why not" question and are here computed more efficiently. GALORE unifies these explanations by defining them as combinations of attribution maps with respect to various classifier predictions and a confidence score. An evaluation protocol that leverages object recognition (CUB200) and scene classification (ADE20K) datasets, combining part and attribute annotations, is also proposed. Experiments show that confidence scores can improve explanation accuracy, that deliberative explanations provide insight into the network's deliberation process, that this deliberation correlates with that performed by humans, and that counterfactual explanations enhance the performance of human students in machine teaching experiments.
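To make the unifying idea concrete, the sketch below illustrates how attribution maps for different class predictions might be combined with a confidence score. This is a minimal illustration, not the paper's implementation: it assumes PyTorch, uses plain gradient saliency as the attribution method (GALORE is agnostic to the choice), and the names `attribution_map` and `counterfactual_map` as well as the elementwise-product combination rule are hypothetical.

```python
import torch
import torch.nn.functional as F


def attribution_map(model, x, class_idx):
    """Gradient-based attribution of one class score w.r.t. the input.

    A generic saliency computation standing in for any attribution
    method; returns a (1, H, W) spatial map.
    """
    x = x.clone().requires_grad_(True)
    logits = model(x)                        # (1, num_classes)
    logits[0, class_idx].backward()
    # Aggregate absolute gradients over channels into a spatial map.
    return x.grad.abs().sum(dim=1)


def counterfactual_map(model, x, predicted, counter):
    """Hypothetical "why not" map combining the attributions of the
    predicted class and of a counterfactual class.  The elementwise
    product (highlighting regions salient for both) is illustrative;
    the combination rule in the paper may differ.
    """
    a_pred = attribution_map(model, x, predicted)
    a_counter = attribution_map(model, x, counter)
    return a_pred * a_counter


# Usage sketch: a confidence score derived from the softmax output
# can gate or weight the explanation, per the abstract's description.
# probs = F.softmax(model(x), dim=1)
# confidence, predicted = probs.max(dim=1)
```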
