Abstract

One direct application of explainable AI feature attribution methods is the detection of unwanted biases. To do so, domain experts typically have to review explained inputs, checking for the presence of unwanted biases learnt by the model. However, the number of samples domain experts must review grows with the dataset, making this task increasingly challenging. Ideally, domain experts would be provided with only a small number of selected samples containing potential biases. The recently published Focus score is a promising tool for selecting such samples. In this work, we conduct a first study in this direction, analyzing the behavior of the Focus score when applied to a biased model. First, we verify that Focus is indeed sensitive to an induced bias. We force a spurious correlation by training a model only on cat-indoor and dog-outdoor images, and we empirically show that the model learnt to distinguish the contexts (indoor vs. outdoor) rather than the cat and dog classes, confirming that it learnt an unwanted bias. We then apply Focus to this biased model, showing that the Focus score decreases when the input contains the aforementioned bias. This analysis sheds light on the behavior of Focus when applied to a biased model, highlighting its strengths for bias detection.
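
To make the quantity concrete, below is a minimal NumPy sketch of a Focus-style computation, assuming the published definition of Focus as the fraction of positive relevance mass that falls on the target-class region of a mosaic image. The names (`focus_score`, `target_mask`) and the toy attribution map are illustrative, not taken from the paper's code.

```python
import numpy as np

def focus_score(attribution, target_mask):
    """Fraction of positive attribution mass inside the target-class region.

    attribution : 2D array of relevance values (e.g. from Grad-CAM or LRP)
    target_mask : boolean 2D array, True where the mosaic shows the target class
    """
    positive = np.clip(attribution, 0.0, None)  # keep only positive relevance
    total = positive.sum()
    if total == 0:
        return 0.0  # degenerate case: no positive relevance anywhere
    return float(positive[target_mask].sum() / total)

# Toy usage: a 4x4 "mosaic" whose left half belongs to the target class.
attr = np.array([[0.2, 0.1,  0.0, 0.3],
                 [0.4, 0.0, -0.1, 0.2],
                 [0.1, 0.3,  0.0, 0.0],
                 [0.0, 0.2,  0.1, 0.1]])
mask = np.zeros((4, 4), dtype=bool)
mask[:, :2] = True  # left half = target-class region

print(focus_score(attr, mask))  # 0.65: most relevance lands on the target
```

Under this definition, a score near 0.5 means the relevance is spread evenly between target and non-target regions, so scores well below that on inputs that break the spurious correlation (e.g. cat-outdoor) are the kind of drop the abstract describes.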
