Abstract
We propose CX-ToM, short for counterfactual explanations with theory-of-mind, a new explainable AI (XAI) framework for explaining decisions made by a deep convolutional neural network (CNN). In contrast to current XAI methods that generate explanations as a single-shot response, we pose explanation as an iterative communication process, i.e., a dialogue between the machine and the human user. More concretely, our CX-ToM framework generates a sequence of explanations in a dialogue by mediating the differences between the minds of the machine and the human user. To do this, we use Theory of Mind (ToM), which helps us explicitly model the human's intention, the machine's mind as inferred by the human, and the human's mind as inferred by the machine. Moreover, most state-of-the-art XAI frameworks provide attention (or heat map) based explanations. In our work, we show that these attention-based explanations are not sufficient for increasing human trust in the underlying CNN model. In CX-ToM, we instead use counterfactual explanations called fault-lines, which we define as follows: given an input image I for which a CNN classification model M predicts class c_pred, a fault-line identifies the minimal semantic-level features (e.g., stripes on a zebra), referred to as explainable concepts, that need to be added to or deleted from I to alter the classification category of I by M to another specified class c_alt. Extensive experiments verify our hypotheses, demonstrating that CX-ToM significantly outperforms state-of-the-art XAI models.
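To make the fault-line definition concrete, the following sketch shows one way such a counterfactual could be searched for greedily. It is an illustration only, not the paper's algorithm: the objects model, candidate_edits, and apply_edit are hypothetical stand-ins, and CX-ToM itself derives explainable concepts from the CNN's own feature maps rather than from a hand-specified edit set.

    # Illustrative sketch (hypothetical API): greedily assemble a fault-line, i.e. a
    # small set of concept-level edits (additions/deletions of explainable concepts)
    # that flips the model's prediction from c_pred to the specified class c_alt.
    def fault_line(model, image, c_alt, candidate_edits, apply_edit, max_edits=5):
        chosen, current = [], image
        for _ in range(max_edits):
            if model.predict(current) == c_alt:      # prediction already flipped
                return chosen
            # score each remaining edit by how much probability it gives to c_alt
            best = max(
                (e for e in candidate_edits if e not in chosen),
                key=lambda e: model.predict_proba(apply_edit(current, e))[c_alt],
                default=None,
            )
            if best is None:
                break
            chosen.append(best)
            current = apply_edit(current, best)
        return chosen if model.predict(current) == c_alt else None

The greedy loop keeps the edit set small, mirroring the minimality requirement in the definition above, although a greedy search only approximates a truly minimal set.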
Highlights
We propose CX-ToM, short for counterfactual explanations with theory-of-mind (ToM), a new explainable AI (XAI) framework for explaining decisions made by a deep convolutional neural network (CNN)
How is human trust measured in CX-ToM? In this work, we focus mainly on measuring and increasing Justified Positive Trust (JPT) and Justified Negative Trust (JNT) (Hoffman et al., 2018) in image classification models
The concept-based explanation framework TCAV and the counterfactual explanation frameworks Contrastive Explanation Methods (CEM) and Counterfactual Visual Explanations (CVE) performed significantly better than the NO-X baseline
Summary
Artificial Intelligence (AI) systems are becoming increasingly ubiquitous, from low-risk environments such as movie recommendation systems and chatbots to high-risk environments such as medical diagnosis and treatment, self-driving cars, drones, and military applications (Chancey et al., 2015; Gulshan et al., 2016; Lyons et al., 2017; Mnih et al., 2013; Gupta et al., 2012; Pulijala et al., 2013; Dasgupta et al., 2014; Agarwal et al., 2017; Palakurthi et al., 2015; Akula et al., 2021a, 2021b, 2021c, 2021d). Understanding and developing human trust in these systems remains a significant challenge because they cannot explain why they reached a specific recommendation or decision. This is especially problematic in high-risk environments such as banking, healthcare, and insurance, where AI decisions can have significant consequences. XAI models, through explanations, aim to make the underlying inference mechanism of AI systems transparent and interpretable to expert users (system developers) and non-expert users (end-users) (Lipton, 2016; Ribeiro et al., 2016; Hoffman, 2017). We focus mainly on increasing justified human trust (JT) in a deep convolutional neural network (CNN) through explanations (Hoffman et al., 2018; Akula et al., 2019a, 2019b).
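As a rough illustration of how justified trust can be quantified in user studies (a minimal sketch, not the exact protocol of Hoffman et al. (2018) or of our experiments): participants predict, for each test image, whether the CNN will classify it correctly, and JPT and JNT are the fractions of the model's successes and failures, respectively, that the participants anticipate correctly.

    # Minimal sketch: compute justified positive/negative trust from a user study in
    # which participants guess whether the CNN will succeed on each image.
    def justified_trust(model_correct, user_says_correct):
        # model_correct[i]: True if the CNN classified image i correctly.
        # user_says_correct[i]: True if the participant predicted the CNN would succeed.
        hits_on_success = [u for m, u in zip(model_correct, user_says_correct) if m]
        hits_on_failure = [not u for m, u in zip(model_correct, user_says_correct) if not m]
        jpt = sum(hits_on_success) / len(hits_on_success) if hits_on_success else 0.0
        jnt = sum(hits_on_failure) / len(hits_on_failure) if hits_on_failure else 0.0
        return jpt, jnt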