Abstract
The complexity of state-of-the-art modeling techniques for image classification impedes the ability to explain model predictions in an interpretable way. A counterfactual explanation highlights the parts of an image which, when removed, would change the predicted class. Legal scholars and data scientists alike are increasingly turning to counterfactual explanations because they provide a high degree of human interpretability, reveal the minimal information that needs to change in order to arrive at a different prediction, and do not require the prediction model to be disclosed. Our literature review shows that existing counterfactual methods for image classification impose strong, and often unrealistic, requirements on access to the training data and the model internals. Therefore, we introduce SEDC, a model-agnostic, instance-level explanation method for image classification that does not need access to the training data. As image classification tasks are typically multiclass problems, an additional contribution is SEDC-T, a variant that allows a target counterfactual class to be specified. These methods are experimentally tested on ImageNet data, and with concrete examples we illustrate how the resulting explanations can give insight into model decisions. Moreover, SEDC is benchmarked against existing model-agnostic explanation methods, demonstrating stability of results, computational efficiency, and the counterfactual nature of the explanations.
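To make the idea concrete, the search can be sketched as a greedy procedure over image segments: the image is partitioned into superpixels, and at each step the segment whose removal most lowers the score of the originally predicted class (SEDC), or most raises the score of a chosen target class (SEDC-T), is removed, until the prediction changes. The snippet below is a minimal illustrative sketch under those assumptions, not the authors' reference implementation; the `predict_proba` interface, the SLIC segmentation, and the mean-colour imputation used to "remove" a segment are choices made for the example.

```python
import numpy as np
from skimage.segmentation import slic  # superpixel segmentation


def sedc_explain(image, predict_proba, target_class=None,
                 n_segments=50, max_iter=None):
    """Greedy SEDC-style counterfactual search (illustrative sketch).

    image         : H x W x 3 float array in [0, 1]
    predict_proba : callable mapping a batch of images to class probabilities
                    (assumed interface of the black-box classifier)
    target_class  : if given, behave like SEDC-T and stop once this class is
                    predicted; otherwise stop on any class change (SEDC)
    Returns the removed segment labels and the perturbed image.
    """
    segments = slic(image, n_segments=n_segments, compactness=10)
    labels = list(np.unique(segments))
    original_class = int(np.argmax(predict_proba(image[None])[0]))

    removed, perturbed = [], image.copy()
    max_iter = max_iter or len(labels)

    for _ in range(max_iter):
        best_label, best_image, best_score = None, None, -np.inf
        for lab in labels:
            if lab in removed:
                continue
            candidate = perturbed.copy()
            # "Remove" a segment by imputing the image's mean colour
            candidate[segments == lab] = image.mean(axis=(0, 1))
            probs = predict_proba(candidate[None])[0]
            # SEDC: push down the original class; SEDC-T: push up the target class
            score = (probs[target_class] if target_class is not None
                     else -probs[original_class])
            if score > best_score:
                best_label, best_image, best_score = lab, candidate, score

        removed.append(best_label)
        perturbed = best_image
        new_class = int(np.argmax(predict_proba(perturbed[None])[0]))
        if (target_class is None and new_class != original_class) or \
           (target_class is not None and new_class == target_class):
            return removed, perturbed  # counterfactual found

    return removed, perturbed  # no counterfactual within the budget
```

In this sketch the removed superpixels together form the counterfactual explanation: the parts of the image which, once imputed, make the classifier change its decision (or reach the requested target class for SEDC-T).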
Highlights
The use of advanced machine learning techniques for image classification has seen substantial progress in recent years
State-of-the-art image classification models are used in a black-box way, without the ability to explain model decisions
We propose an alternative version, SEDC-T, in which segments are iteratively removed until a predefined target class is reached
Summary
The use of advanced machine learning techniques for image classification has seen substantial progress in recent years. The significant improvements in predictive performance, mainly due to the use of deep learning [33], have come at the cost of increased model complexity and opacity. State-of-the-art image classification models are used in a black-box way, without the ability to explain model decisions. The need for explainability has become an important topic, generally referred to as explainable artificial intelligence (XAI) [2, 5, 21, 24, 38]. Often-cited motivations are increased trust in the model, compliance with regulations and laws, and the derivation of insights and guidance for model debugging [16, 21]. The lack of explainability is considered a major barrier to the adoption of automated decision making by companies [5, 7, 11, 42].