Abstract
Many online services, such as search engines, social media platforms, and digital marketplaces, are advertised as being available to any user, regardless of their age, gender, or other demographic factors. However, there are growing concerns that these services may systematically underserve some groups of users. In this paper, we present a framework for internally auditing such services for differences in user satisfaction across demographic groups, using search engines as a case study. We first explain the pitfalls of naïvely comparing the behavioral metrics that are commonly used to evaluate search engines. We then propose three methods for measuring latent differences in user satisfaction from observed differences in evaluation metrics. To develop these methods, we drew on ideas from the causal inference literature and the multilevel modeling literature. Our framework is broadly applicable to other online services, and provides general insight into interpreting their evaluation metrics.
Highlights
Modern search engines are complex, relying heavily on machine learning methods to optimize search results for user satisfaction
Search engines are often evaluated using metrics based on behavioral signals, but several studies have suggested that these metrics are sensitive to a variety of factors: Hassan and White [26] demonstrated that evaluation metric values vary dramatically by user; Carterette et al. [10] made a similar observation and incorporated user variability into evaluation metrics; and Borisov et al. [8] studied the degree to which metrics are sensitive to a user’s search context
Auditing search engines for equal access is much more complicated than comparing evaluation metrics for demographically binned search impressions. We addressed this challenge by proposing three methods for measuring latent differences in user satisfaction from observed differences in evaluation metrics
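To make the simplest of these ideas concrete, here is a minimal sketch of a matched comparison: impressions are grouped by search context (here, the query and an inferred intent), and metric values are compared only within contexts observed for both demographic groups, so contextual differences cannot drive the measured gap. The data, column names, and metric values below are illustrative assumptions, not taken from the paper.

```python
import pandas as pd

# Hypothetical impression log; groups, queries, intents, and metric
# values are made up for illustration.
df = pd.DataFrame({
    "query":  ["q1", "q1", "q2", "q2", "q3", "q3", "q3"],
    "intent": ["nav", "nav", "info", "info", "info", "info", "info"],
    "group":  ["A", "B", "A", "B", "A", "A", "B"],
    "metric": [0.9, 0.7, 0.5, 0.5, 0.6, 0.8, 0.6],
})

# Average the metric within each (query, intent, group) cell, then keep
# only contexts observed for both groups before comparing.
per_ctx = (df.groupby(["query", "intent", "group"])["metric"]
             .mean().unstack("group").dropna())
per_ctx["gap"] = per_ctx["A"] - per_ctx["B"]
print(per_ctx["gap"].mean())
```

Because the comparison is restricted to shared contexts, rare queries issued by only one group drop out, which mirrors the restricted applicability noted in the summary.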
Summary
Modern search engines are complex, relying heavily on machine learning methods to optimize search results for user satisfaction. One way to assess whether a search engine provides equal access is to look for differences in user satisfaction across demographic groups. Naïve comparisons can mislead, however: for example, if retirement planning queries are issued mostly by older users, then the average value of a metric across all users will underemphasize the effectiveness of the search engine on retirement planning queries. Our first method, context matching, controls for two confounding contextual differences: the query itself and the intent of the user (section 5). Because this method attempts to match users’ search contexts as closely as possible, it can only be applied to a restricted set of queries. Our second method is a multilevel model for the effect of query difficulty on evaluation metrics (section 6). This method controls for fewer confounding factors, but is more generalizable. For comparison, we used our third method to conduct an external audit of a leading competitor to Bing using publicly available data from comScore (section 8)
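The multilevel-model idea can be sketched as a mixed-effects regression: a random intercept per query absorbs query difficulty, so the fixed demographic-group coefficient estimates the latent satisfaction gap net of which queries each group tends to issue. The simulated data, column names, and the use of statsmodels below are illustrative assumptions rather than the paper's implementation.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n = 400
queries = rng.integers(0, 20, size=n)           # 20 distinct queries
group = rng.integers(0, 2, size=n)              # demographic group 0/1
difficulty = rng.normal(0, 0.5, size=20)        # latent per-query effect

# Observed metric = baseline + true group effect (0.05)
#                 + query difficulty + noise
metric = 0.6 + 0.05 * group + difficulty[queries] + rng.normal(0, 0.1, n)
df = pd.DataFrame({"metric": metric, "group": group, "query": queries})

# Random intercept per query absorbs difficulty; the fixed "group"
# coefficient estimates the latent satisfaction gap.
model = smf.mixedlm("metric ~ group", df, groups=df["query"]).fit()
print(model.params["group"])
```

A plain difference of group means on the same data would be distorted by whichever hard or easy queries each group happened to issue; the random intercepts remove that per-query component.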