How can I choose an explainer?

Sérgio Jesus, Catarina Belém, Pedro Bizarro, João Bento, Vladimir Balayan, João Gama, Pedro Saleiro

doi:10.1145/3442188.3445941

Abstract

Many research works propose new Explainable AI (XAI) methods designed to generate model explanations with specific properties, or desiderata, such as fidelity, robustness, or human-interpretability. However, explanations are seldom evaluated on their true practical impact on decision-making tasks. Without that assessment, the explanations chosen might in fact hurt the overall performance of the combined system of ML model and end-users. This study aims to bridge this gap by proposing XAI Test, an application-grounded evaluation methodology tailored to isolate the impact of providing the end-user with different levels of information. We conducted an experiment following XAI Test to evaluate three popular post-hoc explanation methods (LIME, SHAP, and TreeInterpreter) on a real-world fraud detection task, with real data, a deployed ML model, and fraud analysts. During the experiment, we gradually increased the information provided to the fraud analysts in three stages: Data Only (just transaction data, without access to the model score or explanations), Data + ML Model Score, and Data + ML Model Score + Explanations. Using strong statistical analysis, we show that, in general, these popular explainers have a worse impact than desired. Highlights of the conclusions include: i) showing Data Only results in the highest decision accuracy and the slowest decision time among all variants tested; ii) all the explainers improve accuracy over the Data + ML Model Score variant but still result in lower accuracy than Data Only; iii) LIME was the least preferred by users, probably due to its substantially lower variability of explanations from case to case.
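The comparison across information stages can be illustrated with a minimal sketch. It assumes each analyst decision is logged with the variant shown, whether the decision was correct, and the time taken. The `Decision` record, the `summarize` helper, and the sample values are all hypothetical and are not taken from the study. The two-proportion z-test is swapped in here as one common way to compare accuracies; the abstract does not say which statistical tests the authors used.

```python
from collections import defaultdict
from dataclasses import dataclass
from math import erf, sqrt


@dataclass(frozen=True)
class Decision:
    """One analyst decision (hypothetical record layout)."""
    variant: str    # e.g. "Data Only", "Data + ML Model Score"
    correct: bool   # did the analyst's decision match the true label?
    seconds: float  # time taken to reach the decision


def summarize(decisions):
    """Return per-variant count, decision accuracy, and mean decision time."""
    groups = defaultdict(list)
    for d in decisions:
        groups[d.variant].append(d)
    return {
        variant: {
            "n": len(ds),
            "accuracy": sum(d.correct for d in ds) / len(ds),
            "mean_seconds": sum(d.seconds for d in ds) / len(ds),
        }
        for variant, ds in groups.items()
    }


def two_proportion_z_test(correct_a, n_a, correct_b, n_b):
    """Two-sided z-test for a difference between two accuracies.

    An illustrative choice of test, not necessarily the one used in the paper.
    """
    pooled = (correct_a + correct_b) / (n_a + n_b)
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        return 0.0, 1.0  # both groups all-correct or all-wrong
    z = (correct_a / n_a - correct_b / n_b) / se
    # Standard normal CDF via erf, then the two-sided tail probability.
    p_value = 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))
    return z, p_value


# Made-up placeholder log, NOT data from the experiment.
log = [
    Decision("Data Only", True, 60.0),
    Decision("Data Only", True, 80.0),
    Decision("Data + ML Model Score", True, 40.0),
    Decision("Data + ML Model Score", False, 50.0),
]
summary = summarize(log)
```

With real logs, `summary` would feed the counts passed to `two_proportion_z_test` for each pair of variants; comparing several pairs would also call for a multiple-comparison correction.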


