Abstract

The field of XAI aims to provide users with explanations of the behavior of AI methods. In particular, local post-hoc interpretability approaches aim to generate explanations for a particular prediction of a trained machine learning model. It is generally recognized that such explanations should be adapted to each user: integrating user knowledge and taking the user's specificity into account makes it possible to provide personalized explanations and to improve their understandability. Yet these elements are rarely taken into account, and only in specific configurations. In this paper, we propose a general framework for integrating user knowledge into post-hoc interpretability methods, based on adding a compatibility term to the cost function. We instantiate the proposed formalization in two scenarios that differ in the form of explanation they produce, in the case where the available user knowledge provides information about the data features. As a result, two new explainability methods are proposed, respectively named Knowledge Integration in Counterfactual Explanation (KICE) and Knowledge Integration in Surrogate Model (KISM). These methods are experimentally studied on several benchmark data sets to characterize the explanations they generate, compared with those of reference methods.
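As a rough illustration of the central idea (a sketch, not the paper's exact formulation), the compatibility term can be read as an extra penalty added to the explainer's usual cost function; the symbols below (e, f, x, K, lambda, c_expl, c_comp) are illustrative assumptions rather than the paper's notation.

```latex
% Illustrative sketch only: a post-hoc explanation objective augmented with a
% user-knowledge compatibility term. All symbols are assumed for illustration.
e^{*} \;=\; \operatorname*{arg\,min}_{e \in \mathcal{E}} \;
      \underbrace{c_{\mathrm{expl}}(e, f, x)}_{\text{standard explanation cost}}
      \;+\; \lambda \,
      \underbrace{c_{\mathrm{comp}}(e, K)}_{\text{penalty for incompatibility with user knowledge } K}
```

Under this reading, KICE would instantiate c_expl as the cost of a counterfactual search and KISM as the loss of fitting an interpretable surrogate model, both regularized by the compatibility term; this mapping is inferred from the abstract rather than stated by the paper.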
