Abstract

Machine learning (ML) plays an increasingly important role in rendering decisions that affect a broad range of groups in society. This creates a need for algorithmic fairness, which holds that automated decisions should be equitable with respect to protected features (e.g., gender, race). Training datasets can contain both class imbalance and protected feature bias. We postulate that, to be effective, a method should reduce both class bias and protected feature bias, which allows for an increase in both model accuracy and fairness. Our method, Fair OverSampling (FOS), uses SMOTE (Chawla et al. in J Artif Intell Res 16:321–357, 2002) to reduce class imbalance and uses feature blurring to enhance group fairness. Because we view bias in imbalanced learning and algorithmic fairness differently, we do not attempt to balance classes and features; instead, we seek to de-bias features and balance the number of class instances. FOS restores numerical class balance by creating synthetic minority class instances and causes a classifier to pay less attention to protected features. It therefore reduces bias for both classes and protected features. Additionally, we take a step toward bridging the gap between fairness and imbalanced learning with a new metric, Fair Utility, that measures model effectiveness with respect to both accuracy and fairness. Our source code and data are publicly available at https://github.com/dd1github/Fair-Over-Sampling.
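
To make the class-balancing idea concrete, the following is a minimal sketch of SMOTE-style interpolation: each synthetic minority instance is placed at a random point on the line segment between a minority sample and one of its nearest minority neighbours. It is an illustration only and not the FOS implementation, which is in the repository linked above. The function name `smote_like_oversample` and its parameters are hypothetical, and the feature-blurring step is left out because the abstract does not specify how it works.

```python
import random


def smote_like_oversample(minority, n_new, k=5, seed=0):
    """Create n_new synthetic points by SMOTE-style interpolation.

    Hypothetical helper for illustration; not the authors' FOS code.
    `minority` is a list of equal-length numeric feature vectors.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")

    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot have more neighbours than other points

    def sq_dist(a, b):
        # Squared Euclidean distance between two feature vectors.
        return sum((x - y) ** 2 for x, y in zip(a, b))

    synthetic = []
    for _ in range(n_new):
        # Pick a random minority sample as the base point.
        i = rng.randrange(len(minority))
        base = minority[i]
        # Find its k nearest neighbours among the other minority samples.
        others = [p for j, p in enumerate(minority) if j != i]
        neighbours = sorted(others, key=lambda p: sq_dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        # Interpolate: base + lam * (neighbour - base), with lam in [0, 1).
        lam = rng.random()
        synthetic.append([b + lam * (q - b) for b, q in zip(base, neighbour)])
    return synthetic


# Toy usage: 6 majority vs. 3 minority samples, so 3 synthetic points balance the classes.
majority = [[5.0, 5.0], [6.0, 5.0], [5.0, 6.0], [6.0, 6.0], [5.5, 5.5], [6.5, 5.5]]
minority = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
new_points = smote_like_oversample(minority, len(majority) - len(minority), k=2)
balanced_minority = minority + new_points
```

Because every synthetic point is a convex combination of two existing minority samples, the new instances stay inside the region already occupied by the minority class rather than duplicating existing rows.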
