Abstract

Machine Learning (ML) models are widely used in decision-making procedures in finance, medicine, education, and other domains. In these areas, ML outcomes can directly affect humans, e.g., by deciding whether a person should get a loan or be released from prison. We therefore cannot blindly rely on black-box ML models and need to explain the decisions they make. This has motivated the development of a variety of ML-explainer systems, including LIME and its successor Anchor. Due to the heuristic nature of the explanations produced by existing tools, it is necessary to validate them. We propose a SAT-based method to assess the quality of explanations produced by Anchor. We encode a trained ML model and an explanation for a given prediction as a propositional formula. Then, using a state-of-the-art approximate model counter, we estimate the quality of the explanation as the number of solutions supporting it.
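
To make the counting step concrete, below is a minimal sketch of how an explanation's quality could be estimated with an off-the-shelf approximate model counter. It assumes the pyapproxmc Python bindings to ApproxMC (the abstract does not name the counter used); the CNF clauses, variable numbering, and the approx_count helper are hypothetical toy stand-ins for the actual encoding of a trained model and an Anchor explanation, not the paper's implementation.

```python
# Sketch only: a toy stand-in for the paper's encoding, counted with pyapproxmc.
import pyapproxmc


def approx_count(clauses):
    """Return an approximate model count of a CNF given as lists of DIMACS-style ints."""
    counter = pyapproxmc.Counter()
    for clause in clauses:
        counter.add_clause(clause)
    cells, hashes = counter.count()  # approximate count is cells * 2**hashes
    return cells * (2 ** hashes)


# Hypothetical toy CNF standing in for the encoded ML model:
# variables 1-4 are binarized features, variable 5 means "model predicts class 1".
model_cnf = [
    [-1, -2, 5],   # if features 1 and 2 hold, the model outputs class 1
    [1, -5],       # the model outputs class 1 only if feature 1 holds...
    [2, -5],       # ...and only if feature 2 holds
    [3, 4],        # some constraint over the remaining features
]

anchor = [[1], [2]]    # the explanation fixes features 1 and 2 (unit clauses)
prediction = [[5]]     # the prediction the anchor is meant to support

covered = approx_count(model_cnf + anchor)                  # inputs matching the anchor
supporting = approx_count(model_cnf + anchor + prediction)  # ...that keep the prediction

print(f"anchor coverage (approx.): {covered}")
print(f"supporting solutions (approx.): {supporting}")
print(f"estimated explanation quality: {supporting / covered:.2f}")
```

The ratio of supporting solutions to covered inputs gives a precision-style estimate of how well the anchor actually pins down the model's prediction; in a realistic setting the two CNFs would come from the propositional encoding of the trained model and the explanation rather than hand-written toy clauses.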
