Abstract
As machine learning has gradually entered into ever more sectors of public and private life, there has been a growing demand for algorithmic explainability. How can we make the predictions of complex statistical models more intelligible to end users? A subdiscipline of computer science known as interpretable machine learning (IML) has emerged to address this urgent question. Numerous influential methods have been proposed, from local linear approximations to rule lists and counterfactuals. In this article, I highlight three conceptual challenges that are largely overlooked by authors in this area. I argue that the vast majority of IML algorithms are plagued by (1) ambiguity with respect to their true target; (2) a disregard for error rates and severe testing; and (3) an emphasis on product over process. Each point is developed at length, drawing on relevant debates in epistemology and philosophy of science. Examples and counterexamples from IML are considered, demonstrating how failure to acknowledge these problems can result in counterintuitive and potentially misleading explanations. Without greater care for the conceptual foundations of IML, future work in this area is doomed to repeat the same mistakes.
Highlights
Machine learning (ML) is ubiquitous in modern society
Though the models we seek to explain with interpretable machine learning (IML) tools are typically empirical risk minimization (ERM) algorithms, the causal nature of this undertaking arguably demands a structural causal model (SCM) approach
No matter the details of particular testing methods, the point I want to stress is that minimizing expected errors of the first and second kind is an obvious desideratum for any inference procedure, not to mention a faithful description of most modern scientific practice
Summary
Machine learning (ML) is ubiquitous in modern society. Complex learning algorithms are widely deployed in private industries like finance (Heaton et al., 2017) and insurance (Lin et al., 2017), as well as public services such as healthcare (Topol, 2019) and education (Peters, 2018). Their prevalence is largely driven by results.