Background Chronic diseases such as chronic kidney disease (CKD), chronic liver disease (CLD), tuberculosis (TB), dementia, and heart disease are major causes of morbidity and mortality worldwide. Early diagnosis and intervention are critical to improving patient outcomes and reducing healthcare costs.
Methods This prospective observational study analyzed clinical data from 270 patients (sample size calculated with G*Power 3.1.9.7 (Heinrich-Heine-Universität Düsseldorf, Düsseldorf, Germany); α = 0.05, power = 0.80), of whom 260 (96.3%) completed the protocol. The cohort comprised 149 (55.2%) males and 121 (44.8%) females, distributed across CKD (n=55, 21.2%), CLD (n=52, 20.0%), TB (n=51, 19.6%), dementia (n=50, 19.2%), and heart disease (n=52, 20.0%). Three machine learning (ML) models (logistic regression, random forest, and support vector machines) were developed, with ChatGPT version 3.5 (OpenAI, San Francisco, CA, USA) assisting in feature selection and hyperparameter optimization. Model performance was evaluated using accuracy, sensitivity (recall), specificity, precision, F1-score, and area under the receiver operating characteristic curve (AUC-ROC). Ten-fold cross-validation was applied to ensure robustness.
Results The random forest model performed best, achieving the highest accuracy in predicting CKD (47/55, 85.3%, p < 0.001; sensitivity 45/55, 82.5%; specificity 48/55, 87.2%) and heart disease (46/52, 88.2%, p < 0.001; sensitivity 45/52, 85.7%; specificity 47/52, 90.1%). Logistic regression effectively predicted TB (41/51, 80.1%, p < 0.01) and dementia (41/50, 82.4%, p < 0.01). Key predictive parameters included hemoglobin (median 10.2 g/dL, IQR 8.4-12.6) and erythrocyte sedimentation rate (median 42.0 mm/hr, IQR 20.0-65.0). Model validation showed high consistency, with positive acid-fast bacilli in 40/51 (78.4%) TB cases and characteristic radiological findings in 43/51 (84.3%) cases.
Conclusion ML algorithms, particularly random forest, show promising potential in chronic disease classification and prediction. The integration of ChatGPT enhanced model development through optimized feature selection and hyperparameter tuning. Future research should focus on external validation through multi-center studies and prospective clinical trials.
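The evaluation protocol named in the Methods (ten-fold cross-validation scored with accuracy, sensitivity, specificity, precision, and F1-score) can be illustrated with a minimal, standard-library-only sketch. This is not the authors' pipeline: the function names `stratified_k_folds` and `binary_metrics` are hypothetical, the stratified round-robin split is an assumption (the abstract does not say how folds were formed), and the metrics assume each disease is scored as a binary 0/1 problem. AUC-ROC is omitted because it requires predicted scores rather than hard labels.

```python
import random
from collections import defaultdict


def stratified_k_folds(labels, k=10, seed=0):
    """Yield (train_idx, test_idx) pairs with roughly equal class mix per fold.

    Hypothetical helper: the abstract does not describe how folds were built.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)

    folds = [[] for _ in range(k)]
    for idxs in by_class.values():
        rng.shuffle(idxs)
        # Deal each class's indices round-robin across the k folds.
        for j, i in enumerate(idxs):
            folds[j % k].append(i)

    for f in range(k):
        test = sorted(folds[f])
        train = sorted(i for g in range(k) if g != f for i in folds[g])
        yield train, test


def binary_metrics(y_true, y_pred):
    """Return accuracy, sensitivity (recall), specificity, precision, and F1
    for binary labels where 1 marks the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    def ratio(num, den):
        # Undefined ratios (zero denominator) are reported as NaN.
        return num / den if den else float("nan")

    sensitivity = ratio(tp, tp + fn)
    precision = ratio(tp, tp + fp)
    return {
        "accuracy": ratio(tp + tn, len(pairs)),
        "sensitivity": sensitivity,
        "specificity": ratio(tn, tn + fp),
        "precision": precision,
        "f1": ratio(2 * precision * sensitivity, precision + sensitivity),
    }
```

In use, a classifier would be fit on each `train` index set, its predictions on the matching `test` set passed to `binary_metrics`, and the ten per-fold results averaged. Because every sample appears in exactly one test fold, each patient contributes to the reported metrics exactly once.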