When More Data Lead Us Astray: Active Data Acquisition in the Presence of Label Bias

Yunyi Li, Maria De-Arteaga, Maytal Saar-Tsechansky

doi:10.1609/hcomp.v10i1.21994

Abstract

An increased awareness of the risks of algorithmic bias has driven a surge of efforts around bias mitigation strategies. The vast majority of the proposed approaches fall under one of two categories: (1) imposing algorithmic fairness constraints on predictive models, and (2) collecting additional training samples. Most recently, at the intersection of these two categories, methods for active learning under fairness constraints have been developed. However, proposed bias mitigation strategies typically overlook the bias present in the observed labels. In this work, we study fairness considerations of active data collection strategies in the presence of label bias. We first present an overview of different types of label bias in the context of supervised learning systems. We then empirically show that, when label bias is overlooked, collecting more data can aggravate bias, and imposing fairness constraints that rely on the observed labels in the data collection process may not address the problem. Our results illustrate the unintended consequences of deploying a model that attempts to mitigate a single type of bias while neglecting others, emphasizing the importance of explicitly differentiating between the types of bias that fairness-aware algorithms aim to address, and highlighting the risks of neglecting label bias during data collection.


Journal: Proceedings of the AAAI Conference on Human Computation and Crowdsourcing
Publication Date: Oct 14, 2022
Citations: 2

