Abstract

Machine learning (ML) models are increasingly being used for high-stakes applications that can greatly impact people’s lives. Sometimes, these models can be biased toward certain social groups on the basis of race, gender, or ethnicity. Many prior works have attempted to mitigate this “model discrimination” by updating the training data (pre-processing), altering the model learning process (in-processing), or manipulating the model output (post-processing). However, little of this work extends to intersectional fairness, where multiple sensitive parameters (e.g., race) and sensitive options (e.g., black or white) are considered at once, a setting with far greater real-world applicability. Prior work in fairness has also suffered from an accuracy–fairness trade-off that prevents both from being high. Moreover, the previous literature has not presented holistic fairness metrics that work under intersectional fairness. In this paper, we address all three of these problems by (a) creating a bias mitigation technique called DualFair and (b) developing a new fairness metric (AWI, a measure of an algorithm’s bias based on inconsistent counterfactual predictions) that can handle intersectional fairness. Lastly, we test our mitigation method on a comprehensive U.S. mortgage-lending dataset and show that our classifier, or fair loan predictor, obtains relatively high fairness and accuracy.
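
The abstract describes AWI only as a measure of bias based on inconsistent counterfactual predictions; the sketch below makes that idea concrete. It is a hypothetical illustration, not the paper’s exact algorithm: it assumes a scikit-learn-style classifier with a `predict` method, a pandas DataFrame of samples, and a caller-supplied list of sensitive columns with their possible options.

```python
import itertools

def counterfactual_inconsistency(model, X, sensitive_cols, sensitive_options):
    """Hypothetical AWI-style score: the fraction of samples whose prediction
    changes when their sensitive attributes are swapped to other options.

    sensitive_options maps each sensitive column to its possible values,
    e.g., {"race": ["black", "white"], "sex": ["male", "female"]}.
    """
    # Every combination of sensitive options defines one counterfactual "world".
    worlds = list(itertools.product(*(sensitive_options[c] for c in sensitive_cols)))
    inconsistent = 0
    for _, row in X.iterrows():
        preds = set()
        for world in worlds:
            cf = row.copy()
            for col, value in zip(sensitive_cols, world):
                cf[col] = value
            preds.add(model.predict(cf.to_frame().T)[0])
        if len(preds) > 1:  # the model is not counterfactually consistent here
            inconsistent += 1
    return inconsistent / len(X)
```

Under this reading, a lower score means fewer individuals would receive a different decision in an otherwise-identical counterfactual world.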

Highlights

  • Machine learning (ML) models have enabled automated decision making in a variety of fields, ranging from lending to hiring to criminal justice

  • Through our experience with DualFair, we argue that it is possible to debias data with multiple sensitive parameters and sensitive options, given a proper pipeline, approach, and data

  • We showed that DualFair can be applied to the Home Mortgage Disclosure Act (HMDA) dataset to create a high-performing fair ML classifier in the mortgage-lending domain


Summary

Introduction

Machine learning (ML) models have enabled automated decision making in a variety of fields, ranging from lending to hiring to criminal justice. Our contributions: In this paper, we target all three of the previously stated problems to develop a novel, real-world-applicable fair ML classifier in the mortgage-lending domain that obtains relatively high fairness and accuracy. In the process, we introduce a bias mitigation pipeline called DualFair (a pre-processing strategy) that approaches intersectional fairness through data sampling techniques, as sketched below, and addresses problems hindering the growth of the fairness, accountability, and transparency (FAT) field.
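
As a rough illustration of what a sampling-based pre-processing step can look like, the sketch below balances every intersectional subgroup (one per combination of sensitive options and class label) by random oversampling. This is a stand-in under stated assumptions, not DualFair’s actual pipeline; the function name, the naive oversampling strategy, and the column arguments are all hypothetical.

```python
import pandas as pd

def balance_intersectional_subgroups(df, sensitive_cols, label_col, seed=0):
    """Equalize subgroup sizes so that no (sensitive combination, label)
    pair dominates the training data."""
    groups = df.groupby(sensitive_cols + [label_col])
    target = max(len(g) for _, g in groups)  # grow every subgroup to the largest
    balanced = [
        g.sample(n=target, replace=True, random_state=seed) for _, g in groups
    ]
    # Concatenate and shuffle so subgroups are interleaved for training.
    return pd.concat(balanced).sample(frac=1, random_state=seed).reset_index(drop=True)
```

A synthetic-oversampling technique such as SMOTE could replace the naive duplication here, generating new minority-subgroup rows instead of repeating existing ones.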

Related Work
Fairness Terminology
Mortgage Data
Debiasing
Novel Fairness Metrics
Experimental Design
RQ1: How Well Does DualFair Create an Intersectional Fair Loan Classifier?
RQ2: Does DualFair Eliminate the Accuracy–Fairness Trade-off?
RQ3: Is DualFair Capable of Capturing All Sensitive Parameters and Sensitive Options in HMDA Data?
Future Work
Conclusions