Abstract

Background
With the increasing availability of large‐scale data sets in biomedical research, machine learning (ML) algorithms are rapidly gaining importance for research and clinical practice. However, ML models risk increasing health disparities whenever they are trained on imbalanced or biased data. Based on these ethical concerns, the WHO guidelines on Artificial Intelligence for Healthcare recommend incorporating algorithmic fairness into ML models by design. Although research on biomarker detection and dementia risk prediction in Alzheimer’s disease increasingly employs ML algorithms, the fairness of these models has not been in focus to date.

Method
Using data from the Alzheimer’s Disease Neuroimaging Initiative database, we generated ML models to predict conversion to Alzheimer’s dementia in 1017 individuals with mild cognitive impairment. Two models were trained, based on (1) biomarkers and (2) biomarkers plus demographic features. Subsequently, fairness metrics were calculated to investigate prediction disparities in the trained models with regard to the sensitive attributes sex, education, race and ethnicity.

Result
The models did not show substantial differences in overall predictive performance. Compared to Non‐Hispanic White individuals, true positive rates for Hispanic and for Non‐White individuals were lower in the biomarker model. Including demographic features in the model led to a further drop in true positive rates and positive predictive values for both Hispanic and Non‐White individuals. Neither model showed substantial predictive disparities regarding sex and education, although both models had a slightly higher accuracy, positive predictive value, and negative predictive value for women and for individuals with lower educational attainment.

Conclusion
This study empirically demonstrates the potential ethical hazards of ML models for the prediction of conversion to Alzheimer’s dementia. While ML models exhibit fair prognostication when based on data that are balanced with regard to sensitive attributes, underrepresentation of specific groups may lead to violations of fairness criteria, with important implications for the clinical translation of these models. Such fairness disparities can be present in models based on biological data alone, and the inclusion of demographic information may exacerbate the predictive disparities in ML models.
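As a concrete illustration of the kind of group‐wise fairness evaluation described above, the sketch below computes accuracy, true positive rate (TPR), positive predictive value (PPV) and negative predictive value (NPV) separately for each level of a sensitive attribute. This is a minimal sketch, not the authors’ analysis code; the variable names and the small synthetic example are illustrative assumptions.

```python
# Minimal sketch (not the study's actual pipeline): group-wise fairness
# metrics for a binary conversion classifier, stratified by a sensitive
# attribute such as sex, race, or ethnicity.
import numpy as np
from sklearn.metrics import confusion_matrix


def group_fairness_metrics(y_true, y_pred, group):
    """Return accuracy, TPR, PPV and NPV for each level of a sensitive attribute."""
    results = {}
    for g in np.unique(group):
        mask = group == g
        tn, fp, fn, tp = confusion_matrix(
            y_true[mask], y_pred[mask], labels=[0, 1]
        ).ravel()
        results[g] = {
            "accuracy": (tp + tn) / (tp + tn + fp + fn),
            "TPR": tp / (tp + fn) if (tp + fn) else float("nan"),
            "PPV": tp / (tp + fp) if (tp + fp) else float("nan"),
            "NPV": tn / (tn + fn) if (tn + fn) else float("nan"),
        }
    return results


# Synthetic example (hypothetical labels and a binary attribute, e.g. sex):
y_true = np.array([1, 0, 1, 1, 0, 0, 1, 0])
y_pred = np.array([1, 0, 0, 1, 0, 1, 1, 0])
group = np.array(["F", "F", "F", "M", "M", "M", "M", "F"])
print(group_fairness_metrics(y_true, y_pred, group))
```

Comparing these per‐group values (e.g. the TPR of Hispanic versus Non‐Hispanic White individuals) is one straightforward way to surface the kind of predictive disparities reported in the Results.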
