Abstract

Anemia, a prevalent hematologic disorder, requires accurate and timely diagnosis for effective management and treatment. This study explores the application of machine learning models to classify anemia types from complete blood count (CBC) data. We evaluated seven tree-based models (DecisionTreeClassifier, ExtraTreeClassifier, RandomForestClassifier, ExtraTreesClassifier, XGBoost, LightGBM, and CatBoost) to identify the most effective approach for anemia diagnosis. The dataset comprised CBC data labeled with anemia diagnoses, sourced from multiple medical facilities. After data preprocessing, features were selected using Variance Inflation Factor (VIF), Predictive Power Score (PPS), and feature importance from ensemble models. The models were trained and evaluated using 5-fold cross-validation, with hyperparameter tuning conducted via GridSearchCV. The DecisionTreeClassifier achieved the highest balanced accuracy, 94.17%, outperforming the more complex ensemble methods. Confusion matrices were consistent with this result, indicating high precision and recall. The study underscores the potential of simple decision tree models in medical diagnosis tasks, particularly when datasets are well preprocessed. These findings suggest that machine learning can improve diagnostic accuracy and efficiency in clinical practice. Future work will explore advanced techniques to further improve performance and integration into clinical workflows.
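The evaluation protocol described above (5-fold cross-validation, GridSearchCV tuning, balanced accuracy, confusion matrices) can be illustrated with a minimal scikit-learn sketch. This is not the study's code. The data are synthetic stand-ins for CBC features, the parameter grid is hypothetical, and the stratified, shuffled folds are an assumption, since the abstract does not state how the folds were built.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import balanced_accuracy_score, confusion_matrix
from sklearn.model_selection import GridSearchCV, StratifiedKFold, cross_val_predict
from sklearn.tree import DecisionTreeClassifier

# Synthetic, imbalanced 3-class data standing in for CBC features and
# anemia-type labels (assumption: the study's dataset is not reproduced here).
X, y = make_classification(
    n_samples=600, n_features=10, n_informative=6, n_redundant=2,
    n_classes=3, weights=[0.6, 0.3, 0.1], random_state=0,
)

# Hypothetical search grid; the abstract does not list the tuned values.
param_grid = {"max_depth": [3, 5, None], "min_samples_leaf": [1, 5]}

# 5-fold cross-validation; stratification and shuffling are assumptions.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

# Grid search that selects hyperparameters by mean balanced accuracy.
search = GridSearchCV(
    DecisionTreeClassifier(random_state=0),
    param_grid,
    scoring="balanced_accuracy",
    cv=cv,
)
search.fit(X, y)

# Out-of-fold predictions from the selected configuration, pooled over
# the five folds, used to build a single confusion matrix.
y_pred = cross_val_predict(search.best_estimator_, X, y, cv=cv)
cm = confusion_matrix(y, y_pred)
bal_acc = balanced_accuracy_score(y, y_pred)

print("best params:", search.best_params_)
print("balanced accuracy:", round(bal_acc, 4))
print(cm)
```

Balanced accuracy is the mean of the per-class recalls, so on imbalanced labels it does not reward a model for favoring the majority class. Because the same folds are used here both to select hyperparameters and to score the result, the printed figure is somewhat optimistic. A nested cross-validation or a held-out test set would give a cleaner estimate.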
