Abstract
As artificial intelligence (AI) becomes widespread, there is increasing attention on investigating bias in machine learning (ML) models. Previous research has concentrated on classification problems, with little emphasis on regression models. This paper presents an easy-to-apply and effective methodology for mitigating bias in bagging and boosting regression models and, more generally, in any model trained by minimizing a differentiable loss function. Our methodology measures bias rigorously and extends the ML model's loss function with a regularization term that penalizes high correlations between model errors and protected attributes. We applied our approach to three popular tree-based ensemble models: a random forest model (RF), a gradient-boosted tree model (GBT), and an extreme gradient boosting model (XGBoost). We implemented our methodology in a case study on predicting road-level traffic volume, where the RF, GBT, and XGBoost models achieved high accuracy overall yet performed poorly on roads in minority-populated areas. Our bias mitigation approach reduced minority-related bias by over 50%.
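To make the core idea concrete, the regularized loss described above can be sketched as a plain MSE objective plus a penalty on the squared Pearson correlation between residuals and a protected attribute. This is a minimal illustration, not the paper's implementation: the function and parameter names (`fair_mse_loss`, the weight `lam`) are hypothetical, and the paper's exact bias measure and penalty form may differ.

```python
import numpy as np

def fair_mse_loss(y_true, y_pred, z, lam=1.0):
    """Mean squared error plus lam * corr(residuals, z)^2,
    where z encodes the protected attribute.
    A differentiable penalty like this can be added to any
    loss minimized by gradient-based training."""
    e = y_true - y_pred                      # residuals
    mse = np.mean(e ** 2)
    ec = e - e.mean()                        # center residuals
    zc = z - z.mean()                        # center protected attribute
    denom = np.sqrt(np.sum(ec ** 2) * np.sum(zc ** 2))
    corr = np.sum(ec * zc) / denom if denom > 0 else 0.0
    return mse + lam * corr ** 2
```

With `lam = 0` the function reduces to ordinary MSE; increasing `lam` trades a small amount of accuracy for errors that are less correlated with the protected attribute.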