Abstract
Although future regulations increasingly advocate that AI applications must be interpretable by users, we know little about how such explainability can affect human information processing. By conducting two experimental studies, we help to fill this gap. We show that explanations pave the way for AI systems to reshape users' understanding of the world around them. Specifically, state-of-the-art explainability methods evoke mental model adjustments that are subject to confirmation bias, allowing misconceptions and mental errors to persist and even accumulate. Moreover, mental model adjustments create spillover effects that alter users' behavior in related but distinct domains where they do not have access to an AI system. These spillover effects of mental model adjustments risk manipulating user behavior, promoting discriminatory biases, and biasing decision making. The reported findings serve as a warning that the indiscriminate use of modern explainability methods as an isolated measure to address AI systems' black-box problems can lead to unintended, unforeseen problems because it creates a new channel through which AI systems can influence human behavior in various domains.