Abstract

The prevalence of chronic diseases poses significant challenges to public health systems worldwide. This study evaluates four machine learning models (Gradient Boosting Classifier, Support Vector Machine (SVM), Logistic Regression, and Random Forest) for classifying chronic disease indicators in the U.S. Chronic Disease Indicators (CDI) dataset. Each model was assessed on accuracy, precision, recall, and F1 score, supplemented by its classification report and confusion matrix. The Gradient Boosting Classifier outperformed the other models, with an accuracy of 64.36%, precision of 63.72%, recall of 64.36%, and F1 score of 63.88%. SVM and Random Forest demonstrated moderate performance, and Logistic Regression served as the baseline for comparison. The results indicate that the Gradient Boosting Classifier handles the complexities of the CDI dataset better than the other three models, suggesting its potential for improving chronic disease prediction and management. Future research should focus on refining these models, addressing class imbalance, and incorporating domain knowledge to enhance interpretability and applicability in real-world scenarios.
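The reported recall for the Gradient Boosting Classifier (64.36%) is identical to its accuracy. This is what support-weighted averaging of per-class metrics produces, since weighted recall is algebraically equal to accuracy; the abstract does not state which averaging scheme was used, so this reading is an assumption. The following minimal sketch shows how the four summary metrics can be derived from a multiclass confusion matrix under that assumption. The function name `weighted_metrics` and the 3-class matrix are hypothetical and for illustration only; they are not taken from the study or the CDI data.

```python
def weighted_metrics(confusion):
    """Compute accuracy and support-weighted precision, recall, and F1.

    confusion[i][j] is the number of samples whose true class is i
    and whose predicted class is j.
    """
    n_classes = len(confusion)
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[k][k] for k in range(n_classes))
    accuracy = correct / total

    precision = recall = f1 = 0.0
    for k in range(n_classes):
        tp = confusion[k][k]
        support = sum(confusion[k])                   # true samples of class k
        predicted = sum(row[k] for row in confusion)  # samples predicted as class k
        p_k = tp / predicted if predicted else 0.0
        r_k = tp / support if support else 0.0
        f_k = 2 * p_k * r_k / (p_k + r_k) if (p_k + r_k) else 0.0
        weight = support / total                      # class share of all samples
        precision += weight * p_k
        recall += weight * r_k
        f1 += weight * f_k

    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Made-up 3-class confusion matrix (assumption, not CDI results).
example = [
    [50, 10, 5],
    [8, 30, 12],
    [4, 9, 22],
]
metrics = weighted_metrics(example)
```

In this sketch `metrics["recall"]` and `metrics["accuracy"]` coincide, because each class's recall is weighted by its share of the samples. Precision and F1 generally differ from accuracy, as they do in the reported figures.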
