Abstract

Studies have shown that the people depicted in image search results tend to be from majority groups with respect to socially salient attributes such as gender or race. This skew goes beyond what already exists in the world; i.e., the search results for images of people are more imbalanced than the ground truth would suggest. For example, Kay et al. showed that although 28% of CEOs in the U.S. are women, only 10% of the top 100 results for "CEO" in Google Image Search are women. Similar observations abound across search terms and socially salient attributes. Most existing approaches to correcting this kind of bias assume that the images of people include labels denoting the relevant socially salient attributes. These labels are used explicitly to change the dataset, adjust the training of the algorithm, or modify the execution of the method. However, such labels are often unknown. Further, inferring these labels with machine learning techniques may not be possible within acceptable accuracy ranges, and may not be desirable due to the additional biases the inference process could introduce. As observed in prior work, alternative approaches consider the diversity of image features, which often does not translate into images of visibly diverse people. We develop a novel approach that takes as input a visibly diverse control set of images of people and uses this set as part of a procedure to select a set of images of people in response to a query. The goal is to produce a result set whose visible diversity emulates the diversity depicted in the control set. The approach accomplishes this by evaluating the similarity of the images selected by a black-box algorithm to the images in the diversity control set, and incorporating this "diversity score" into the final selection process. Importantly, the approach does not require images to be labeled at any point; effectively, it gives a way to implicitly diversify the selected set of images.
We provide two variants of our approach: the first is a modification of the well-known MMR algorithm that incorporates the diversity scores, and the second is a more efficient variant that does not consider within-list redundancy. We evaluate these approaches empirically on two image datasets: 1) a new dataset we collect containing the top 100 Google Image Search results for 96 occupations, on which we evaluate gender and skin-tone diversity with respect to occupations, and 2) the well-known CelebA dataset of celebrity images, on which we evaluate gender diversity with respect to facial features such as "smiling" or "glasses". Both of our approaches produce image sets that significantly improve the visible diversity of the results (i.e., include a larger fraction of anti-stereotypical images) relative to current Google Image Search results and to other state-of-the-art algorithms for diverse image summarization. Further, they appear to come at minimal cost to accuracy. These empirical results demonstrate the effectiveness of simple label-independent interventions for diversifying image search.
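The MMR-style selection described above can be sketched as a greedy loop that trades off query relevance, similarity to the diversity control set, and within-list redundancy. This is a minimal illustrative sketch under stated assumptions, not the authors' implementation: the feature vectors, cosine similarity, the weights `lam` and `mu`, and taking the maximum over the control set are all choices made here for illustration.

```python
import math

def cosine(u, v):
    """Cosine similarity between two feature vectors (assumed nonzero)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def diversity_score(x, control_set):
    """Similarity of image x to the diversity control set, taken here as
    the maximum cosine similarity to any control image (an assumption;
    other aggregations, e.g. the mean, are equally plausible)."""
    return max(cosine(x, c) for c in control_set)

def mmr_diversify(features, relevance, control_set, k, lam=0.5, mu=0.5):
    """Greedily pick k candidates, rewarding relevance (weight lam) and
    similarity to the control set (weight mu), and penalizing redundancy
    with already-selected items (weight 1 - lam)."""
    selected, remaining = [], list(range(len(features)))
    while len(selected) < k and remaining:
        def score(i):
            redundancy = max(
                (cosine(features[i], features[j]) for j in selected),
                default=0.0,
            )
            return (lam * relevance[i]
                    + mu * diversity_score(features[i], control_set)
                    - (1 - lam) * redundancy)
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected
```

Dropping the redundancy penalty (e.g. setting `lam = 1`) scores each candidate independently of the list built so far, which captures the spirit of the second, more efficient variant.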
