Introduction: Anemia is a major public health problem that especially affects newborns and infants, adolescent girls, young women, pregnant women, and postpartum women. It is caused by a reduced supply of red blood cells in the body, or by damage to or weakening of their structure. One of the advantages of machine learning is its ability to predict outcomes. Objective: This study compares effective algorithms for predicting anemia with respect to data set origin, data set size, evaluation metrics, and accuracy, and identifies the predictors they produce. Method: This research uses a scoping review of 4 databases (Scopus, EBSCO, PubMed, and IEEE Xplore) covering 2019 to 2024, with the keywords anemia, algorithms, machine learning, and prediction. The search returned 384 articles, which were screened in several stages, leaving 9 articles for review. Result: The best-performing algorithms reported for anemia prediction were penalized regression (LASSO) with accuracy above 64%; XGBoost with 100% accuracy and an execution time of 0.2404 seconds; CatBoost with 97.6% accuracy; Random Forest with accuracies of 95.49% and 72%; the J48 algorithm with 97.7% accuracy; Logistic Regression with 66% accuracy and an AUC of 69%; and linear SVM with an AUC of 79.9%. Conclusion: Machine learning can assist in developing anemia prediction models by exploring large amounts of data and producing precise and fast predictors. The predictors obtained depend on the algorithms selected in each study.