Tackling bias in the data for breast cancer prediction using machine learning-based decision support

Shuning Yin, Gaurav Nanda, Raji Sundararajan

doi:10.1080/24709360.2023.2207919

Abstract

In this study, a machine learning (ML)-based decision support approach was developed to identify breast cancer likelihood in patients, based on their background and physiological data. Two ML models, Naïve Bayes and Logistic Regression, were applied to the Breast Cancer Surveillance Consortium dataset, which had a ratio of about 9:1 of non-cancer cases (‘Class 0’) to cancer cases (‘Class 1’). We manually built both balanced and unbalanced training datasets and a non-overlapping testing dataset using a stratified sampling method. For each model, we partitioned the prediction results on the testing set into two groups: the ‘Agree’ group included cases where the predictions of the models trained on balanced and unbalanced data agreed, and the remaining cases fell into the ‘Disagree’ group. Sensitivity and Positive Predictive Value were used as the prediction performance measures. For Naïve Bayes, the sensitivity for Class 1 increased from 0.687 for the regular predictions to 0.936 for the ‘Agree’ group; for Logistic Regression, it increased from 0.358 to 0.8306. This indicates that the ‘Agree’ group predictions were more accurate and could be labeled as high-confidence ML predictions. The ‘Agree’ group comprised 89% of the cases in the testing set, so the improved prediction performance applied to a large portion of the testing dataset.
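The partition into ‘Agree’ and ‘Disagree’ groups and the two performance measures can be illustrated with a minimal sketch. This is not the authors' code: the function names (`sensitivity`, `ppv`, `split_by_agreement`) are hypothetical, and the labels and predictions below are made-up toy values, not data from the Breast Cancer Surveillance Consortium dataset. The sketch assumes that each test case has one prediction from a model trained on balanced data and one from a model trained on unbalanced data.

```python
# Illustrative sketch only: toy labels and predictions, hypothetical helper names.

def sensitivity(y_true, y_pred, positive=1):
    """Sensitivity (recall) for the positive class: TP / (TP + FN)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    return tp / (tp + fn) if (tp + fn) else float("nan")


def ppv(y_true, y_pred, positive=1):
    """Positive Predictive Value (precision): TP / (TP + FP)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    return tp / (tp + fp) if (tp + fp) else float("nan")


def split_by_agreement(pred_balanced, pred_unbalanced):
    """Return index lists for cases where the two models agree and disagree."""
    agree, disagree = [], []
    for i, (b, u) in enumerate(zip(pred_balanced, pred_unbalanced)):
        (agree if b == u else disagree).append(i)
    return agree, disagree


# Toy testing set (assumed values): 1 = cancer ('Class 1'), 0 = non-cancer ('Class 0').
y_true          = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
pred_balanced   = [1, 1, 0, 1, 0, 0, 0, 0, 1, 0]  # model trained on balanced data
pred_unbalanced = [1, 0, 0, 0, 0, 0, 0, 0, 0, 0]  # model trained on unbalanced data

agree_idx, disagree_idx = split_by_agreement(pred_balanced, pred_unbalanced)

# Within the 'Agree' group both models give the same prediction,
# so either prediction list can be used to score it.
agree_true = [y_true[i] for i in agree_idx]
agree_pred = [pred_balanced[i] for i in agree_idx]

agree_share = len(agree_idx) / len(y_true)
print("Share of cases in 'Agree' group:", agree_share)
print("Sensitivity (Class 1, 'Agree' group):", sensitivity(agree_true, agree_pred))
print("PPV (Class 1, 'Agree' group):", ppv(agree_true, agree_pred))
```

Comparing these ‘Agree’ group figures with the same measures computed over the full testing set gives the kind of before-and-after comparison the abstract reports.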


More From: Biostatistics & Epidemiology
