Abstract

While interpretability tools are intended to help people better understand machine learning (ML), we find that they can, in fact, impair understanding. This paper presents a pre-registered, controlled experiment showing that ML practitioners (N=119) spent 5x less time on task, and were 17% less accurate about the data and model, when given access to interpretability tools. We present bounded rationality as the theoretical reason behind these findings. Bounded rationality presumes human departures from perfect rationality, and it is often effectuated by satisficing, i.e., an inclination towards "good enough" understanding. Adding interactive elements---a strategy often employed to promote deliberative thinking and engagement, and tested in our experiment---also does not help. We discuss implications for interpretability designers and researchers related to how cognitive and contextual factors can affect the effectiveness of interpretability tool use.