Abstract

In recent research, human-understandable explanations of machine learning models have received a lot of attention. Often, explanations are given in the form of model simplifications or visualizations. However, as shown in cognitive science as well as in early AI research, concept understanding can also be improved by aligning a given instance of a concept with a similar counterexample. Contrasting a given instance with a structurally similar example which does not belong to the concept highlights what characteristics are necessary for concept membership. Such near misses have been proposed by Winston (Learning structural descriptions from examples, 1970) as efficient guidance for learning in relational domains. We introduce GeNME, an explanation generation algorithm for relational concepts learned with Inductive Logic Programming (ILP). The algorithm identifies near miss examples from a given set of instances and ranks these examples by their degree of closeness to a specific positive instance. A modified rule which covers the near miss but not the original instance is given as an explanation. We illustrate GeNME with the well-known family domain consisting of kinship relations, the visual relational Winston arches domain, and a real-world domain dealing with file management. We also present a psychological experiment comparing human preferences for rule-based, example-based, and near miss explanations in the family and the arches domains.
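
To make the near miss idea concrete, the following minimal Python sketch ranks candidate instances of a relational rule by how many of the rule's body literals they satisfy, so that the closest non-member (for example, a grandmother when the target concept is grandfather) surfaces as a near miss candidate. The toy kinship facts, the grandfather rule, and the candidate bindings are assumptions made for illustration only; this is not the authors' GeNME implementation.

    # Illustrative sketch only: near miss candidates for a relational concept,
    # ranked by closeness (number of satisfied body literals). The knowledge
    # base and rule below are toy assumptions, not taken from the paper.

    facts = {
        ("parent", "tom", "bob"), ("parent", "bob", "ann"),
        ("parent", "eve", "bob"), ("parent", "tom", "liz"),
        ("male", "tom"), ("male", "bob"),
        ("female", "eve"), ("female", "ann"),
    }

    # Learned rule: grandfather(X, Z) :- parent(X, Y), parent(Y, Z), male(X).
    rule_body = [("parent", "X", "Y"), ("parent", "Y", "Z"), ("male", "X")]

    def satisfied_literals(binding, body, kb):
        """Count how many body literals hold under a variable binding."""
        return sum(
            (pred, *[binding.get(a, a) for a in args]) in kb
            for pred, *args in body
        )

    # A positive instance covered by the rule and two candidate bindings.
    positive = {"X": "tom", "Y": "bob", "Z": "ann"}
    candidates = [
        {"X": "eve", "Y": "bob", "Z": "ann"},  # grandmother: only male(X) fails
        {"X": "ann", "Y": "tom", "Z": "bob"},  # fails two body literals
    ]

    print("positive:", positive, "satisfies",
          satisfied_literals(positive, rule_body, facts), "of", len(rule_body))

    # Rank non-members by closeness: more satisfied literals means closer.
    ranked = sorted(
        candidates,
        key=lambda b: satisfied_literals(b, rule_body, facts),
        reverse=True,
    )
    for b in ranked:
        score = satisfied_literals(b, rule_body, facts)
        print(b, "satisfies", score, "of", len(rule_body), "literals")

A full near miss explanation, as described in the abstract, would additionally present a minimally modified rule that covers the closest non-member (here, the grandmother binding) but not the original positive instance.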

Highlights

  • Explaining classifier decisions has gained much attention in current research

  • We present an algorithmic approach to generate near miss examples in the context of Inductive Logic Programming (ILP) and demonstrate the approach for a generic family domain, a visual domain, and a real-world domain dealing with file management (Siebers and Schmid 2019)

  • We demonstrate the generation of near miss explanations by applying GeNME to the family domain, a relational visual domain of blocksworld arches, and a real-world domain dealing with file management

Introduction

Explaining classifier decisions has gained much attention in current research. If explanations are intended for the end-user, their main function is to make the human comprehend how the system reached a decision (Miller 2019). A variety of approaches to explainability have been proposed (Adadi and Berrada 2018; Molnar 2019): explanations can be local, focusing on the current class decision, or global, covering the learned model (Ribeiro et al. 2016; Adadi and Berrada 2018). Counterfactuals typically are minimal changes in feature values which would have resulted in a different decision, such as "You were denied a loan because your annual income was £30,000." Contrastive explanations have been proposed mainly for image classification. The contrastive explanation method CEM (Dhurandhar et al. 2018) highlights what is minimally but critically absent from an image for it to belong to a given class. The ProtoDash algorithm has been proposed to identify prototypes and criticisms for arbitrary symmetric positive definite kernels and has been applied to both tabular and image data.
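
As a concrete illustration of the counterfactual idea mentioned above, the following small Python sketch searches for the smallest single-feature change that flips a toy decision rule. The decision rule, its £45,000 threshold, and the step size are invented for this example and are not taken from any of the cited works.

    # Hedged sketch: a minimal counterfactual for a toy loan decision.
    # The decision rule and all numbers below are illustrative assumptions.

    def approve_loan(income_gbp: int) -> bool:
        """Toy decision rule: approve if annual income reaches 45,000 GBP."""
        return income_gbp >= 45_000

    def minimal_counterfactual(income_gbp: int, step: int = 1_000,
                               max_steps: int = 100):
        """Increase income in small steps until the decision flips."""
        for k in range(1, max_steps + 1):
            candidate = income_gbp + k * step
            if approve_loan(candidate):
                return candidate
        return None

    income = 30_000
    flipped = minimal_counterfactual(income)
    print(f"Denied at £{income:,}; approved if income were £{flipped:,}.")

Near miss explanations as generated by GeNME address a similar contrastive intuition, but for relational concepts and rules rather than feature vectors.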
