The Meaning of “Explainability Fosters Trust in AI”

Andrea Ferrario,Michele Loi

doi:10.2139/ssrn.3916396

Abstract

We provide a philosophical explanation of the relation between artificial intelligence (AI) explainability and trust in AI, providing a case for expressions, such as “explainability fosters trust in AI,” that commonly appear in the literature. This explanation considers the justification of the trustworthiness of an AI with the need to monitor it during its use. We discuss the latter by referencing an account of trust, called “trust as anti-monitoring,” that different authors contributed developing. We focus our analysis on the case of medical AI systems, noting that our proposal is compatible with internalist and externalist justifications of trustworthiness of medical AI and recent accounts of warranted contractual trust. We propose that “explainability fosters trust in AI” if and only if it fosters warranted paradigmatic trust in AI, i.e., trust in the presence of the justified belief that the AI is trustworthy, which, in turn, causally contributes to rely on the AI in the absence of monitoring. We argue that our proposed approach can intercept the complexity of the interactions between physicians and medical AI systems in clinical practice, as it can distinguish between cases where humans hold different beliefs on the trustworthiness of the medical AI and exercise varying degrees of monitoring on them. Finally, we discuss the main limitations of explainable AI methods and we argue that the case of “explainability fosters trust in AI” may be feasible only in a limited number of physician-medical AI interactions in clinical practice.

Full Text