Towards a holistic view of bias in machine learning: bridging algorithmic fairness and imbalanced learning

Damien Dablain,Bartosz Krawczyk,Nitesh Chawla

doi:10.1007/s44248-024-00007-1

Abstract

Machine learning (ML) is playing an increasingly important role in rendering decisions that affect a broad range of groups in society. This posits the requirement of algorithmic fairness, which holds that automated decisions should be equitable with respect to protected features (e.g., gender, race). Training datasets can contain both class imbalance and protected feature bias. We postulate that, to be effective, both class and protected feature bias should be reduced—which allows for an increase in model accuracy and fairness. Our method, Fair OverSampling (FOS), uses SMOTE (Chawla in J Artif Intell Res 16:321–357, 2002) to reduce class imbalance and feature blurring to enhance group fairness. Because we view bias in imbalanced learning and algorithmic fairness differently, we do not attempt to balance classes and features; instead, we seek to de-bias features and balance the number of class instances. FOS restores numerical class balance through the creation of synthetic minority class instances and causes a classifier to pay less attention to protected features. Therefore, it reduces bias for both classes and protected features. Additionally, we take a step toward bridging the gap between fairness and imbalanced learning with a new metric, Fair Utility, that measures model effectiveness with respect to accuracy and fairness. Our source code and data are publicly available at https://github.com/dd1github/Fair-Over-Sampling.

Full Text