Abstract

Decision trees have long been recognized as models of choice in sensitive applications where interpretability is of paramount importance. In this paper, we examine the computational ability of Boolean decision trees to serve explanation purposes. We focus both on abductive explanations (suited to explaining why a given instance has been classified as it was by the decision tree at hand) and on contrastive explanations (suited to explaining why a given instance has not been classified by the decision tree as expected). More precisely, we are interested in deriving, minimizing, and counting abductive and contrastive explanations. We prove that the set of all irredundant abductive explanations (also known as PI-explanations or sufficient reasons) for an instance given a decision tree can be exponentially larger than the size of the input (the instance and the decision tree). Generating the full set of sufficient reasons for an instance can therefore be out of reach. In addition, deriving a single sufficient reason, though computationally easy for decision trees, is not informative enough in general: two sufficient reasons for the same instance may differ on many features. To deal with this issue and provide synthetic views of the set of all sufficient reasons, we define the notions of relevant features and of necessary features, which characterize the (possibly negated) features appearing in at least one, or in every, sufficient reason for an instance, and we show that they can be computed in polynomial time. We also introduce the notion of explanatory importance, which indicates how frequent each (possibly negated) feature is in the set of all sufficient reasons. We show how the explanatory importance of a (possibly negated) feature and the number of sufficient reasons for an instance can be obtained via a model counting operation, which turns out to be practical in many cases.
We also explain how to enumerate minimum-size sufficient reasons. Finally, we show that, unlike the set of sufficient reasons, the set of all contrastive explanations for an instance given a decision tree can be derived, minimized, and counted in polynomial time.
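To make the two kinds of explanations concrete, the following sketch computes the subset-minimal sufficient reasons and the subset-minimal contrastive explanations for an instance, by brute-force enumeration over a hypothetical toy classifier standing in for a Boolean decision tree (predicting 1 iff x0 AND x1). The function names and the tree itself are illustrative assumptions, not the polynomial-time algorithms developed in the paper; brute force is exponential in the number of features and only serves to fix the definitions.

```python
from itertools import product, combinations

N = 3  # number of Boolean features

def tree(x):
    # Hypothetical toy decision tree: predicts 1 iff x0 AND x1 (x2 is irrelevant).
    if x[0]:
        return 1 if x[1] else 0
    return 0

def is_sufficient(term, cls):
    """A partial assignment (dict: feature -> value) is sufficient for class
    `cls` if every completion of it is classified as `cls` (brute force)."""
    free = [i for i in range(N) if i not in term]
    for bits in product([0, 1], repeat=len(free)):
        x = [term.get(i, 0) for i in range(N)]
        for i, b in zip(free, bits):
            x[i] = b
        if tree(x) != cls:
            return False
    return True

def sufficient_reasons(x):
    """All subset-minimal sufficient reasons for instance x."""
    cls = tree(x)
    reasons = []
    for k in range(N + 1):  # smaller subsets first, so minimality is easy
        for S in combinations(range(N), k):
            term = {i: x[i] for i in S}
            if is_sufficient(term, cls) and not any(set(r) <= set(S) for r in reasons):
                reasons.append(term)
    return reasons

def contrastive_explanations(x):
    """All subset-minimal sets of features whose flip changes the class."""
    cls = tree(x)
    result = []
    for k in range(1, N + 1):
        for S in combinations(range(N), k):
            y = [1 - x[i] if i in S else x[i] for i in range(N)]
            if tree(y) != cls and not any(set(r) <= set(S) for r in result):
                result.append(S)
    return result

x = [1, 1, 1]
print(sufficient_reasons(x))        # [{0: 1, 1: 1}]  (x2 is not relevant)
print(contrastive_explanations(x))  # [(0,), (1,)]
```

For this instance, the unique sufficient reason fixes x0 = 1 and x1 = 1 (so both features are necessary and x2 is irrelevant), while flipping either x0 or x1 alone changes the predicted class, giving two contrastive explanations.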
