Abstract

Counterfactual explanations are viewed as an effective way to explain machine learning predictions. This interest is reflected by a relatively young literature with already dozens of algorithms aiming to generate such explanations. These algorithms are focused on finding how features can be modified to change the output classification. However, this rather general objective can be achieved in different ways, which brings about the need for a methodology to test and benchmark these algorithms. The contributions of this work are manifold: first, a large benchmarking study of 10 algorithmic approaches on 22 tabular datasets is performed, using nine relevant evaluation metrics; second, the introduction of a novel, first of its kind, framework to test counterfactual generation algorithms; third, a set of objective metrics to evaluate and compare counterfactual results; and, finally, insight from the benchmarking results that indicates which approaches obtain the best performance on which types of dataset. This benchmarking study and framework can help practitioners determine which technique and building blocks most suit their context, and can help researchers in the design and evaluation of current and future counterfactual generation algorithms. Our findings show that, overall, there is no single best algorithm for generating counterfactual explanations, as performance depends strongly on properties of the dataset, model, score, and factual point.
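The general objective described above — modifying a point's features until the model's output classification changes — can be illustrated with a minimal, hypothetical greedy search. This sketch is not one of the ten benchmarked algorithms; the function name, step size, and iteration budget are illustrative assumptions:

```python
# Minimal sketch of counterfactual generation by greedy feature perturbation.
# This is an illustrative toy method, not one of the benchmarked algorithms.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

def find_counterfactual(model, x, step=0.1, max_iter=200):
    """Greedily nudge one feature at a time until the predicted class flips."""
    original = model.predict(x.reshape(1, -1))[0]
    cf = x.copy()
    for _ in range(max_iter):
        if model.predict(cf.reshape(1, -1))[0] != original:
            return cf  # prediction flipped: counterfactual found
        # Try +/- step on each feature and keep the move that most reduces
        # the predicted probability of the original class.
        best_move, best_score = None, np.inf
        for j in range(len(cf)):
            for delta in (step, -step):
                cand = cf.copy()
                cand[j] += delta
                p = model.predict_proba(cand.reshape(1, -1))[0][original]
                if p < best_score:
                    best_score, best_move = p, cand
        cf = best_move
    return None  # no counterfactual found within the budget

X, y = make_classification(n_samples=200, n_features=4, random_state=0)
model = LogisticRegression().fit(X, y)
cf = find_counterfactual(model, X[0])
if cf is not None:
    # True by construction: the search only returns once the class flips.
    print(model.predict(X[0].reshape(1, -1))[0] != model.predict(cf.reshape(1, -1))[0])
```

Real algorithms differ precisely in how they constrain this search — e.g., by keeping the counterfactual close to the factual point, changing few features, or staying on the data manifold — which is why the evaluation metrics mentioned above matter.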

Highlights

  • Machine learning algorithms are becoming ever more common in our daily lives [1]; one of the reasons for this widespread application is the high prediction accuracy these methods can achieve

  • Considering counterfactual explanations that do not need to be realistic, we find that for categorical datasets GS, LO, and SY are the best ranked; for datasets with a large number of categorical columns and models with a large number of neurons, SE is by far the best algorithm when the factual point's class is highly unbalanced

  • End-users may evaluate their specific requirements against metrics such as those presented in this article to select the algorithmic implementation that best matches their expectations


Introduction

Machine learning algorithms are becoming ever more common in our daily lives [1]. One of the reasons for this widespread application is the high prediction accuracy these methods can achieve. The inability to explain why certain predictions are made can have a drastic impact on the adoption of automated decision-making in society, as people are often reluctant to use such complex models, even if they are known to improve predictive performance [3,4,5,6,7,8,9]. These models may hide unfair biases that discriminate against sensitive groups [10,11,12,13]. These problems can be even more critical when models are constantly created and updated, as often observed in real-time applications [14].

