Predicting treatment pathways in Class II malocclusion patients using machine learning: A comparative study of four algorithms for classifying camouflage, growth modulation, and surgical decisions.


Similar Papers
  • Research Article
  • 10.1111/nicc.70190
Machine Learning (ML) Algorithm to Predict Mortality of Trauma Patients Admitted to the Intensive Care Units (ICU) in Northwest Ethiopia.
  • Oct 9, 2025
  • Nursing in critical care
  • Mengistu Abebe Messelu + 11 more

Predicting mortality among trauma patients is a critical task that can guide clinical decision-making, management and resource allocation in Intensive Care Units (ICU). Machine learning has been increasingly employed in clinical practice to predict the mortality of critically ill patients. This study aimed to develop and evaluate machine learning models for predicting the mortality of trauma patients. A multicentre cross-sectional study was conducted. Data were collected retrospectively from 613 trauma patients admitted to the ICU between January 1, 2020, and December 30, 2021, in comprehensive specialised hospitals of Northwest Ethiopia. The Kampala trauma score (KTS II) and revised trauma score (RTS) were calculated for each patient on admission; the scores range from 5 to 10 and from 0 to 7.84, respectively, with lower scores indicating more severe trauma and a higher risk of mortality. Pre-processing, feature selection and model fitting were done using Python version 3.12. Seven Machine Learning (ML) models were developed to predict mortality among trauma patients at the time of hospital discharge: Decision Tree (DT), Random Forest (RF), Naive Bayes (NB), K-Nearest Neighbours (KNN), Logistic Regression (LR), Support Vector Machine (SVM) and Extreme Gradient Boosting (XGBoost). The dataset was divided into training (80%) and testing (20%) sets, and 10-fold cross-validation was employed to improve model performance. The models' predictive performance was measured using metrics derived from the confusion matrix, such as sensitivity, specificity and precision, together with the Receiver Operating Characteristic (ROC) curve. Of the 613 trauma patients admitted to the intensive care units, 248 (40.5%) died. Among the variables included, the Kampala trauma score (KTS II), Glasgow Coma Scale (GCS) score and presence of complications were the most reliable features for predicting mortality among trauma patients.
This study found that the Random Forest (RF) algorithm outperformed other machine learning algorithms, achieving an accuracy of 95%, sensitivity of 96%, precision of 93%, F1 score of 94% and a Receiver Operating Characteristics (ROC) score of 99%. Moreover, Support Vector Machines (SVM) and XGBoost also performed exceptionally well, with AUC scores of 0.98 and 0.97, respectively. We found the Random Forest (RF) to be the best-performing machine learning model in predicting the mortality among trauma patients. This is the first machine learning model developed specifically for mortality prediction of trauma patients in Ethiopia. The application of machine learning algorithms is warranted to stratify the risk of mortality, enabling evidence-based intervention and maximising resource utilisation. Thus, further external validation on independent data from prospective studies is needed to evaluate the universal applicability of the model to clinical practice. This study holds substantial value for clinical practice by enhancing decision support, enabling early identification of high-risk patients and supporting proactive surveillance and timely interventions. The integration of machine learning for mortality prediction is particularly impactful, as it facilitates remote monitoring and telemedicine, helping to bridge gaps in healthcare access. Additionally, it aids in optimising treatment strategies through patient-centred data, informs health planning and resource allocation, supports personalised care and advances data-driven research and policy-making.
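
The confusion-matrix metrics reported throughout these comparisons (sensitivity, specificity, precision, F1) follow directly from the four cell counts; a minimal plain-Python sketch (the function name is illustrative, not from the study):

```python
def confusion_metrics(y_true, y_pred):
    """Derive sensitivity, specificity, precision, and F1 from binary labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    sensitivity = tp / (tp + fn)   # recall on the positive (died) class
    specificity = tn / (tn + fp)   # recall on the negative (survived) class
    precision = tp / (tp + fp)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return sensitivity, specificity, precision, f1
```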

  • Research Article
  • Cited by 2
  • 10.1097/md.0000000000038513
Performance evaluation of ML models for preoperative prediction of HER2-low BC based on CE-CBBCT radiomic features: A prospective study
  • Jun 14, 2024
  • Medicine
  • Xianfei Chen + 3 more

To explore the value of machine learning (ML) models based on contrast-enhanced cone-beam breast computed tomography (CE-CBBCT) radiomics features for the preoperative prediction of human epidermal growth factor receptor 2 (HER2)-low expression breast cancer (BC). Fifty-six patients with HER2-negative invasive BC who underwent preoperative CE-CBBCT were prospectively analyzed. Patients were randomly divided into training and validation cohorts at a ratio of approximately 7:3. A total of 1046 quantitative radiomic features were extracted from CE-CBBCT images and normalized using z-scores. The Pearson correlation coefficient and recursive feature elimination were used to identify the optimal features. Six ML models were constructed based on the selected features: linear discriminant analysis (LDA), random forest (RF), support vector machine (SVM), logistic regression (LR), AdaBoost (AB), and decision tree (DT). Receiver operating characteristic curves and the area under the curve (AUC) were used to evaluate model performance. Seven features were selected as optimal for constructing the ML models. In the training cohort, the AUC values for SVM, LDA, RF, LR, AB, and DT were 0.984, 0.981, 1.000, 0.970, 1.000, and 1.000, respectively; in the validation cohort, they were 0.859, 0.880, 0.781, 0.880, 0.750, and 0.713. Among all ML models, the LDA and LR models demonstrated the best performance. The DeLong test showed no significant differences among the receiver operating characteristic curves of the ML models in the training cohort (P > .05). In the validation cohort, however, the differences between the AUC of LDA and those of RF, AB, and DT were statistically significant (P = .037, .003, .046), as were the differences between the AUC of LR and those of RF, AB, and DT (P = .023, .005, .030); no other pairwise comparisons reached statistical significance. ML models based on CE-CBBCT radiomics features achieved excellent performance in the preoperative prediction of HER2-low BC and could potentially serve as an effective tool to assist precise and personalized targeted therapy.
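
Two of the preprocessing steps named here, z-score normalization and Pearson-correlation screening, can be sketched in plain Python (illustrative helpers, not the authors' code):

```python
import math

def zscore(column):
    """Standardise one feature column to mean 0, SD 1 (population SD)."""
    mean = sum(column) / len(column)
    sd = math.sqrt(sum((x - mean) ** 2 for x in column) / len(column))
    return [(x - mean) / sd for x in column]

def pearson(x, y):
    """Pearson correlation coefficient between two feature columns."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = math.sqrt(sum((a - mx) ** 2 for a in x))
    vy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (vx * vy)
```

In a radiomics pipeline, features whose pairwise |r| exceeds some cutoff are typically pruned before recursive feature elimination is run on the remainder.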

  • Research Article
  • Cited by 33
  • 10.1002/mp.14699
Detecting MLC modeling errors using radiomics-based machine learning in patient-specific QA with an EPID for intensity-modulated radiation therapy.
  • Jan 27, 2021
  • Medical Physics
  • Madoka Sakai + 13 more

We sought to develop machine learning models to detect multileaf collimator (MLC) modeling errors using radiomic features of fluence maps measured in patient-specific quality assurance (QA) for intensity-modulated radiation therapy (IMRT) with an electronic portal imaging device (EPID). Fluence maps measured with an EPID for 38 beams from 19 clinical IMRT plans were assessed. Plans with various degrees of error in MLC modeling parameters [i.e., MLC transmission factor (TF) and dosimetric leaf gap (DLG)] and, for comparison, plans with an MLC positional error were created. For a total of 152 error plans for each type of error, we calculated fluence difference maps for each beam by subtracting the calculated maps from the measured maps. A total of 837 radiomic features were extracted from each fluence difference map, and we determined the number of features used for the training dataset by using random forest regression. Machine learning models using five typical algorithms [decision tree, k-nearest neighbor (kNN), support vector machine (SVM), logistic regression, and random forest] were developed for binary classification between the error-free plan and the plan with the corresponding error for each type of error. We used part of the total dataset to perform fourfold cross-validation to tune the models, and the remaining test dataset to evaluate the performance of the developed models. A gamma analysis was also performed between the measured and calculated fluence maps with criteria of 3%/2 mm and 2%/2 mm for all types of error. The selected radiomic features and their optimal number were similar for the TF and DLG error detection models but differed for the MLC positional error. The highest sensitivity was 0.913 for the TF error with SVM and logistic regression, 0.978 for the DLG error with kNN and SVM, and 1.000 for the MLC positional error with kNN, SVM, and random forest.
The highest specificity was 1.000 for the TF error with a decision tree, SVM, and logistic regression, 1.000 for the DLG error with a decision tree, logistic regression, and random forest, and 0.909 for the MLC positional error with a decision tree and logistic regression. The gamma analysis performed worst: at 3%/2 mm, sensitivities were 0.737 for the TF and DLG errors and 0.882 for the MLC positional error. Adding another type of error to the fluence maps significantly reduced the sensitivity for the TF and DLG errors, whereas no effect was observed for MLC positional error detection. Compared with conventional gamma analysis, the radiomics-based machine learning models showed higher sensitivity and specificity in detecting a single type of MLC modeling error and the MLC positional error. Although the developed models need further improvement for detecting multiple types of error, radiomics-based IMRT QA is a promising approach for detecting MLC modeling errors.
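
As a rough illustration of the gamma analysis the models are benchmarked against, a simplified 1-D global gamma pass rate under, e.g., a 3%/2 mm criterion might look like the following (a sketch only; clinical gamma tools search in 2-D/3-D with sub-voxel interpolation):

```python
import math

def gamma_pass_rate(measured, calculated, spacing_mm, dose_tol=0.03, dist_tol_mm=2.0):
    """Simplified 1-D global gamma analysis (e.g., 3%/2 mm).
    Dose differences are normalised to the maximum calculated dose."""
    d_max = max(calculated)
    passed = 0
    for i, dm in enumerate(measured):
        best = float("inf")
        for j, dc in enumerate(calculated):
            dose_term = ((dm - dc) / (dose_tol * d_max)) ** 2
            dist_term = ((i - j) * spacing_mm / dist_tol_mm) ** 2
            best = min(best, math.sqrt(dose_term + dist_term))
        passed += best <= 1.0  # point passes when its minimum gamma is <= 1
    return passed / len(measured)
```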

  • Research Article
  • 10.1097/01.xcs.0000895032.09245.e2
Applying Machine Learning to Predict Esophageal Cancer Recurrence after Esophagectomy
  • Oct 17, 2022
  • Journal of the American College of Surgeons
  • Kevin C Kapcio + 7 more

INTRODUCTION: Artificial intelligence and machine learning (ML) models have recently been adapted to healthcare applications with promising results. The objective of this proof-of-concept study was to develop a ML model to predict esophageal cancer recurrence after esophagectomy. METHODS: We conducted a retrospective study of 260 consecutive patients who underwent esophagectomy for esophageal cancer from 2009 through 2018. Over 20,000 patient-specific characteristics were collected. Risk prediction models for different prediction windows were constructed via a sequential forward selection process. Five traditional machine learning algorithms were included in our analysis: Logistic Regression (LR), Support Vector Machine (SVM), Random Forest (RF), Decision Tree (DT), and Naive Bayes (NB). Model performance was assessed by calculating sensitivity, specificity, positive predictive value (PPV), F1 score, area under the receiver operating characteristic curve (AUC), and overall accuracy using five-fold cross-validation. RESULTS: Of the 260 patients, 121 (46.5%) experienced cancer recurrence at a median of 267 days (range, 49-2,057 days). The highest AUC for prediction of recurrence at any time after esophagectomy was 0.82 (SD ± 0.02), achieved by DT. Similarly, the highest accuracy for predicting whether a new subject entered into the model will have recurrence at any time was 82% (0.82 ± 0.03), achieved by RF. The LR algorithm had the highest accuracy (92%, 0.92 ± 0.01) for predicting recurrence within 180 days after esophagectomy (Table). CONCLUSION: This proof-of-concept study demonstrates the feasibility of using machine learning models to predict esophageal cancer recurrence. The accuracy and AUC of the machine learning models exceeded 80% in all recurrence prediction timeframes.

Table. Overall Performance of the Machine Learning Models

| Prediction window | Recurrence / no recurrence, No. (%) | Model | Accuracy (SD) | AUC (SD) |
| --- | --- | --- | --- | --- |
| Any time after esophagectomy | 121 (46.5) / 139 (53.5) | LR | 0.80 (0.05) | 0.81 (0.04) |
| | | SVM | 0.77 (0.05) | 0.79 (0.06) |
| | | RF | 0.82 (0.03) | 0.82 (0.05) |
| | | DT | 0.79 (0.03) | 0.82 (0.02) |
| | | NB | 0.80 (0.04) | 0.78 (0.06) |
| Within 180 days after esophagectomy | 37 (14.2) / 223 (85.8) | LR | 0.92 (0.01) | 0.78 (0.05) |
| | | SVM | 0.92 (0.02) | 0.81 (0.07) |
| | | RF | 0.90 (0.03) | 0.73 (0.10) |
| | | DT | 0.90 (0.02) | 0.70 (0.04) |
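
The sequential forward selection process mentioned in the Methods can be sketched generically; here `score_fn` stands in for any cross-validated model metric (names are illustrative, not the authors' pipeline):

```python
def forward_select(features, score_fn, k):
    """Greedy sequential forward selection: repeatedly add the feature that
    most improves score_fn(subset) until k features are chosen."""
    selected = []
    remaining = list(features)
    while remaining and len(selected) < k:
        best_feat = max(remaining, key=lambda f: score_fn(selected + [f]))
        selected.append(best_feat)
        remaining.remove(best_feat)
    return selected
```

Greedy selection is the standard way to tame a space of 20,000+ candidate characteristics, at the cost of possibly missing feature interactions that only help in combination.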

  • Research Article
  • 10.1177/17588359251355058
Establishment and validation of a convenient and efficient screening tool for active pulmonary tuberculosis in lung cancer patients based on common parameters.
  • Jul 1, 2025
  • Therapeutic advances in medical oncology
  • Fan Zhang + 9 more

Coexistent pulmonary tuberculosis and lung cancer (PTB-LC) is a rare type of disease with frequent under- and/or mis-diagnosis. Establishment of a reliable screening model for PTB-LC holds considerable medical and economic significance. We aimed to develop an efficient and convenient tool to identify high-risk individuals for tuberculosis (TB) infection among LC patients based on commonly available parameters in clinical practice. This study consisted of a primary retrospective patient cohort for model construction and verification, and a prospective patient cohort for prospective validation. Patients with active PTB-LC and LC diagnosed in Beijing Chest Hospital from 2018 to 2022 were collected and 1:1 matched according to time of admission and were classified into a training set (n = 281) and testing set (n = 121). Baseline information, clinicopathological features, imaging manifestations, and blood testing results were collected and analyzed. Five machine learning methods, including logistic regression (LR), random forest (RF), support vector machine (SVM), decision tree (DT), and neural network (NN), were employed to develop a screening model for PTB-LC. Through multivariable analysis, gender, pleural effusion, cavitation, monocyte count (MONO), and plasma adenosine deaminase (ADA) levels were identified as independent predictors of PTB-LC and included in model construction. LR, RF, SVM, DT, and NN were used to construct the screening or pre-diagnosis models. The RF demonstrated the best performance with an area under the curve of 0.966 in the training set, 0.817 in the testing set, and 0.805 in the prospective dataset. The accuracy, precision, recall, and F1 score of the RF model of the training set were 0.88, 0.87, 0.89, and 0.88, respectively, and these indicators of the testing set were 0.71, 0.75, 0.72, and 0.74, respectively, which were superior to those of other methods. The prospective cohort further validated the good performance of the screening model. 
We also established a nomogram based on gender, pleural effusion, cavitation, MONO, and serum ADA to assess patients at high risk of developing TB infection; further TB-related diagnostic tests are recommended for these high-risk patients. The RF screening model constructed with gender, pleural effusion, cavitation, MONO, and ADA may help identify patients at high risk of PTB-LC among cases of LC alone.
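
A nomogram of this kind is typically the visual form of a logistic model: each predictor contributes weighted points that map to a probability. The sketch below uses purely hypothetical coefficients (the abstract does not report the real weights) to illustrate the idea:

```python
import math

# Hypothetical coefficients for illustration only -- the paper's nomogram
# weights are NOT reproduced in the abstract.
COEFFS = {"male": 0.8, "pleural_effusion": 1.1, "cavitation": 1.4,
          "monocyte_count": 0.6, "ada": 0.9}
INTERCEPT = -3.0

def ptb_lc_risk(patient):
    """Logistic risk score from the five predictors in the screening model."""
    z = INTERCEPT + sum(COEFFS[k] * patient.get(k, 0.0) for k in COEFFS)
    return 1.0 / (1.0 + math.exp(-z))
```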

  • Research Article
  • Cited by 6
  • 10.2214/ajr.23.29579
Machine Learning Using Presentation CT Perfusion Imaging for Predicting Clinical Outcomes in Patients With Aneurysmal Subarachnoid Hemorrhage.
  • Oct 11, 2023
  • AJR. American journal of roentgenology
  • Pengzhan Yin + 5 more

BACKGROUND. Prediction of outcomes in patients with aneurysmal subarachnoid hemorrhage (aSAH) is challenging using current clinical predictors. OBJECTIVE. The purpose of our study was to evaluate the utility of machine learning (ML) models incorporating presentation clinical and CT perfusion imaging (CTP) data in predicting delayed cerebral ischemia (DCI) and poor functional outcome in patients with aSAH. METHODS. This study entailed retrospective analysis of data from 242 patients (mean age, 60.9 ± 11.8 [SD] years; 165 women, 77 men) with aSAH who, as part of a prospective trial, underwent CTP followed by standardized evaluation for DCI during initial hospitalization and poor 3-month functional outcome (i.e., modified Rankin scale score ≥ 4). Patients were randomly divided into training (n = 194) and test (n = 48) sets. Five ML models (k-nearest neighbor [KNN], logistic regression [LR], support vector machine [SVM], random forest [RF], and category boosting [CatBoost]) were developed for predicting outcomes using presentation clinical and CTP data. The least absolute shrinkage and selection operator method was used for feature selection. Ten-fold cross-validation was performed in the training set. Traditional clinical models were developed using stepwise LR analysis of clinical, but not CTP, data. RESULTS. Qualitative CTP analysis was identified as the most impactful feature for both outcomes. In the test set, the traditional clinical model, KNN, LR, SVM, RF, and CatBoost showed AUC for predicting DCI of 0.771, 0.812, 0.824, 0.908, 0.930, and 0.949, respectively, and AUC for predicting poor 3-month functional outcome of 0.863, 0.858, 0.879, 0.908, 0.926, and 0.958. CatBoost was selected as the optimal model. In the test set, AUC was higher for CatBoost than for the traditional clinical model for predicting DCI (p = .004) and poor 3-month functional outcome (p = .04). 
In the test set, sensitivity and specificity for predicting DCI were 92.3% and 60.0% for the traditional clinical model versus 92.3% and 85.7% for CatBoost, and sensitivity and specificity for predicting poor 3-month functional outcome were 100.0% and 65.8% for the traditional clinical model versus 90.0% and 94.7% for CatBoost. A web-based prediction tool based on CatBoost was created. CONCLUSION. ML models incorporating presentation clinical and CTP data outperformed traditional clinical models in predicting DCI and poor 3-month functional outcome. CLINICAL IMPACT. ML models may help guide early management of patients with aSAH.
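
The AUC values reported for these models can be computed directly from predicted scores via the Mann-Whitney formulation; a minimal sketch (function name illustrative):

```python
def roc_auc(labels, scores):
    """AUC as the probability that a random positive case outranks a random
    negative case (Mann-Whitney U formulation); ties count half."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```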

  • Research Article
  • 10.1093/jbcr/iraf019.053
53 Utilizing a Machine Learning Approach for the Prediction of In-Hospital Mortality After Thermal Burn
  • Apr 1, 2025
  • Journal of Burn Care & Research
  • Tuan Le + 5 more

Introduction Burn injury is a devastating form of trauma that can lead to long-term poor outcomes and death. Early and accurate mortality prediction is crucial for determining resuscitation status and appropriateness of care. This is especially important in mass-casualty situations, where triage capacity and resources are quickly depleted. This study focuses on the role of machine learning (ML) in predicting mortality, an approach that has been increasingly used and proven effective in predicting clinical outcomes. The study's aim was to identify the ML models with the best diagnostic performance for predicting mortality in patients with burn injury. Methods A retrospective observational study of 115 patients admitted to a regional burn center within 4 hours of thermal injury was conducted. Eighty-four features were selected, including patient demographic data, vital signs, injury characteristics (%TBSA, inhalation injury, GCS, revised Baux score), CBC, rapid thromboelastography, serum cytokine levels, coagulation markers, and clinical outcomes (e.g., mortality and hospital length of stay). Six ML models were developed to predict mortality: decision tree (DT), gradient boosting (GB), logistic regression (LR), neural network (NN), random forest (RF), and support vector machine (SVM). The models were compared by the area under the receiver operating characteristic curve (AUC) and the Kolmogorov-Smirnov statistic (KS; Youden) to identify the best diagnostic performance for predicting mortality in burn patients, ensuring a comprehensive and rigorous analysis. Results Of the 115 patients studied, most were male (68.7%), with a median age of 40 years and TBSA of 11.8%; 13 (11.3%) died. The AUC values for the DT, GB, LR, NN, RF, and SVM models were 0.95, 0.99, 0.55, 0.99, 0.99, and 0.99, respectively; the KS values were 0.99, 0.99, 0.91, 0.99, 0.99, and 0.99, respectively.
The SVM showed superior AUC and KS among these models, with sensitivity and specificity of 88% and 100%, respectively. The five essential predictors identified among the 20 most important variables for the champion model (Table) were revised Baux score, %TBSA, TNF-R1, PAP, and IL-1b. Conclusions The SVM was the best-performing ML model (highest AUC and KS without overfitting), making it the most reliable for predicting mortality. These findings underline the importance of this research in burn injury management and its potential to impact clinical practice. Applicability of Research to Practice Machine learning models can provide real-time predictions of mortality and recovery, helping clinicians make more informed decisions in medical management. Funding for the Study DoD Award Numbers W911NF-10-1-0459 and W911QY-15-C-0025
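
The KS (Youden) statistic used here to compare models is the maximum gap between the true-positive and false-positive rates over all score thresholds; a plain-Python sketch:

```python
def ks_statistic(labels, scores):
    """Kolmogorov-Smirnov (Youden) statistic: max of TPR - FPR over thresholds."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    best = 0.0
    for t in sorted(set(scores)):
        tpr = sum(1 for s in pos if s >= t) / len(pos)
        fpr = sum(1 for s in neg if s >= t) / len(neg)
        best = max(best, tpr - fpr)
    return best
```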

  • Research Article
  • Cited by 1
  • 10.3389/fendo.2025.1486350
Machine learning applications to classify and monitor medication adherence in patients with type 2 diabetes in Ethiopia.
  • Mar 20, 2025
  • Frontiers in endocrinology
  • Ewunate Assaye Kassaw + 3 more

Medication adherence plays a crucial role in determining the health outcomes of patients, particularly those with chronic conditions like type 2 diabetes. Despite its significance, there is limited evidence regarding the use of machine learning (ML) algorithms to predict medication adherence within the Ethiopian population. The primary objective of this study was to develop and evaluate ML models designed to classify and monitor medication adherence levels among patients with type 2 diabetes in Ethiopia, to improve patient care and health outcomes. Using a random sampling technique in a cross-sectional study, we obtained data from 403 patients with type 2 diabetes at the University of Gondar Comprehensive Specialized Hospital (UoGCSH), excluding 13 subjects who were unable to respond and 6 with incomplete data from an initial cohort of 422. Medication adherence was assessed using the General Medication Adherence Scale (GMAS), an eleven-item Likert scale questionnaire. The responses served as features to train and test machine learning (ML) models. To address data imbalance, the Synthetic Minority Over-sampling Technique (SMOTE) was applied. The dataset was split using stratified K-fold cross-validation to preserve the distribution of adherence levels. Eight widely used ML algorithms were employed to develop the models, and their performance was evaluated using metrics such as accuracy, precision, recall, and F1 score. The best-performing model was subsequently deployed for further analysis. Out of 422 enrolled patients, 403 data samples were collected, with 11 features extracted from each respondent. To mitigate potential class imbalance, the dataset was increased to 620 samples using the Synthetic Minority Over-sampling Technique (SMOTE). 
Machine learning models including Logistic Regression (LR), Support Vector Machine (SVM), K Nearest Neighbor (KNN), Decision Tree (DT), Random Forest (RF), Gradient Boost Classifier (GBC), Multilayer Perceptron (MLP), and 1D Convolutional Neural Network (1DCNN) were developed and evaluated. Although the performance differences among the models were subtle (within a range of 0.001), the SVM classifier outperformed the others, achieving a recall of 0.9979 and an AUC of 0.9998. Consequently, the SVM model was selected for deployment to monitor and detect patients' medication adherence levels, enabling timely interventions to improve patient outcomes. This study highlights a variety of machine learning (ML) models that can be effectively used to monitor and classify medication adherence in diabetic patients in Ethiopia. However, to fully realize the potential impact of digital health applications, further studies that include patients from diverse settings are necessary. Such research could enhance the generalizability of these models and provide insights into the broader applicability of digital tools for improving medication adherence and patient outcomes in varying healthcare contexts.
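
SMOTE, used here to grow the dataset from 403 to 620 samples, synthesises new minority-class points by interpolating between a sample and one of its nearest minority neighbours; a minimal sketch (not the imbalanced-learn implementation):

```python
import random

def smote(minority, n_new, k=2, seed=0):
    """Minimal SMOTE sketch: each synthetic point lies on the segment between
    a random minority sample and one of its k nearest minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        a = rng.choice(minority)
        neighbours = sorted((x for x in minority if x is not a),
                            key=lambda x: sum((u - v) ** 2 for u, v in zip(a, x)))[:k]
        b = rng.choice(neighbours)
        lam = rng.random()  # interpolation factor in [0, 1)
        synthetic.append(tuple(u + lam * (v - u) for u, v in zip(a, b)))
    return synthetic
```

Because the interpolation happens only within the minority class, the new samples stay inside its convex hull rather than duplicating existing rows.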

  • Research Article
  • Cited by 1
  • 10.1007/s11604-023-01475-2
Utility of machine learning for identifying stapes fixation on ultra-high-resolution CT.
  • Aug 10, 2023
  • Japanese journal of radiology
  • Ruowei Tang + 8 more

Imaging diagnosis of stapes fixation (SF) is challenging owing to a lack of definite evidence. We developed a comprehensive machine learning (ML) model to identify SF on ultra-high-resolution CT. We retrospectively enrolled 109 participants (143 ears) and divided them into a training set (115 ears) and a test set (28 ears). Stapes mobility (SF or non-SF) was determined by surgical inspection. In the ML analysis, rectangular regions of interest were placed on consecutive axial slices in the training set. Radiomic features were extracted and fed into the training session. The test set was analyzed using 7 ML models (support vector machine, k nearest neighbor, decision tree, random forest, extra trees, eXtreme Gradient Boosting, and Light Gradient Boosting Machine) and by 2 dedicated neuroradiologists. Diagnostic performance (sensitivity, specificity and accuracy, with surgical findings as the reference) was compared between the radiologists and the optimal ML model using the McNemar test. The mean age of the participants was 42.3 ± 17.5 years. The Light Gradient Boosting Machine (LightGBM) model showed the highest sensitivity (0.83), specificity (0.81), accuracy (0.82) and area under the curve (0.88) for detecting SF among the 7 ML models. The neuroradiologists achieved good sensitivities (0.75 and 0.67), moderate-to-good specificities (0.63 and 0.56) and good accuracies (0.68 and 0.61). The model showed no statistically significant differences from the neuroradiologists (P values 0.289-1.000). Compared with the neuroradiologists, the LightGBM model achieved competitive diagnostic performance in identifying SF and has the potential to be a supportive tool in clinical practice.
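
The McNemar test used to compare the model with the readers operates on discordant paired calls on the same cases; the (uncorrected) chi-square statistic can be sketched as follows (names illustrative):

```python
def mcnemar_statistic(model_correct, reader_correct):
    """McNemar chi-square (no continuity correction) from paired
    correct/incorrect calls of two raters on the same cases."""
    b = sum(1 for m, r in zip(model_correct, reader_correct) if m and not r)
    c = sum(1 for m, r in zip(model_correct, reader_correct) if r and not m)
    # Only discordant pairs (b, c) carry information about a difference.
    return (b - c) ** 2 / (b + c) if b + c else 0.0
```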

  • Research Article
  • Cited by 3
  • 10.1002/lary.31439
Machine Learning for Predictive Analysis of Otolaryngology Residency Letters of Recommendation.
  • Apr 11, 2024
  • The Laryngoscope
  • Vikram Vasan + 6 more

Letters of recommendation (LORs) are a highly influential yet subjective and often enigmatic aspect of the residency application process. This study hypothesizes that LORs contain valuable insights into applicants and can be used to predict outcomes. This pilot study utilizes natural language processing and machine learning (ML) models on LOR text to predict interview invitations for otolaryngology residency applicants. A total of 1642 LORs from the 2022-2023 application cycle were retrospectively retrieved from a single institution. LORs were preprocessed and vectorized using three different techniques to represent the written prose in a form an ML model can process: CountVectorizer (CV), Term Frequency-Inverse Document Frequency (TF-IDF), and Word2Vec (WV). Five ML models were then trained and tested on the LORs: Logistic Regression (LR), Naive Bayes (NB), Decision Tree (DT), Random Forest (RF), and Support Vector Machine (SVM). Of the 337 applicants, 67 were interviewed and 270 were not. In total, 1642 LORs (26.7% interviewed) were analyzed. The two best-performing model and vectorization combinations in predicting interview invitations were the TF-IDF-vectorized DT and CV-vectorized DT models. This preliminary study revealed that ML models and vectorization combinations can provide better-than-chance predictions of interview invitations for otolaryngology residency applicants. The high-performing ML models were able to extract meaningful information from the LORs to predict applicant interview invitation. An automated process to help predict an applicant's likelihood of obtaining an interview invitation could be a valuable tool for training programs in the future. Laryngoscope, 134:4016-4022, 2024.
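
TF-IDF vectorization, one of the three techniques used here, weights a term's in-document frequency by its rarity across the corpus; a minimal sketch (no smoothing or normalisation, unlike scikit-learn's TfidfVectorizer):

```python
import math
from collections import Counter

def tfidf(docs):
    """Minimal TF-IDF: term frequency weighted by inverse document frequency."""
    n = len(docs)
    tokenised = [doc.lower().split() for doc in docs]
    # Document frequency: in how many documents does each term appear?
    df = Counter(term for doc in tokenised for term in set(doc))
    vectors = []
    for doc in tokenised:
        tf = Counter(doc)
        vectors.append({term: (count / len(doc)) * math.log(n / df[term])
                        for term, count in tf.items()})
    return vectors
```

Terms that appear in every letter (e.g., "recommend") score zero, which is precisely what lets a downstream classifier focus on the distinguishing language.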

  • Research Article
  • Cited by 2
  • 10.1186/s41043-024-00647-8
Prediction and feature selection of low birth weight using machine learning algorithms
  • Oct 12, 2024
  • Journal of Health, Population and Nutrition
  • Tasneem Binte Reza + 1 more

Background and aims: The birth weight of a newborn is a crucial factor that affects their overall health and future well-being. Low birth weight (LBW) is a widespread global issue, which the World Health Organization defines as weighing less than 2,500 g at birth. LBW can have severe negative consequences for an individual's health, including neonatal mortality and various health concerns throughout life. To address this problem, this study used BDHS 2017–2018 data to uncover important aspects of LBW using a variety of machine learning (ML) approaches and to determine the best feature selection technique and the best predictive ML model.

Methods: To pick out the key features, the Boruta algorithm and the wrapper method were used. Logistic Regression (LR) was used as the traditional method, and several machine learning classifiers were then applied, including Decision Tree (DT), Support Vector Machine (SVM), Naïve Bayes (NB), Random Forest (RF), eXtreme Gradient Boosting (XGBoost), and Adaptive Boosting (AdaBoost), to determine the best model for predicting LBW. Model performance was evaluated based on specificity, sensitivity, accuracy, F1 score and AUC value.

Results: The Boruta algorithm identified eleven significant features, including the respondent's age, highest education level, educational attainment, wealth index, age at first birth, weight, height, BMI, age at first sexual intercourse, birth order number, and whether the child is a twin. Incorporating the Boruta algorithm's significant features, the performance of traditional LR and the ML methods (DT, SVM, NB, RF, XGBoost, and AdaBoost) was evaluated: LR had a specificity, sensitivity, accuracy and F1 score of 0.85, 0.5, 85.15% and 0.915, while the ML methods DT, SVM, NB, RF, XGBoost, and AdaBoost had respective accuracy values of 85.35%, 85.15%, 84.54%, 81.18%, and 84.41%. Based on specificity, sensitivity, accuracy, F1 score and AUC, RF (specificity = 0.99, sensitivity = 0.58, accuracy = 85.86%, F1 score = 0.9243, AUC = 0.549) outperformed the other methods. The performance of both the classical (LR) and ML models improved dramatically when important features were extracted using the wrapper method. The LR method identified five significant features, with a specificity, sensitivity, accuracy and F1 score of 0.87, 0.33, 87.12% and 0.9309. The region, whether the infant is a twin, and cesarean delivery were the three key features discovered by the DT and RF models implemented with the wrapper technique; all three models had an identical F1 score of 0.9318. However, "child is twin" was recognized as a significant feature by the SVM, NB, and AdaBoost models, with an F1 score of 0.9315. Ultimately, with an F1 score of 0.9315, the XGBoost model recognized "child is twin" and "age at first sex" as relevant features. Random Forest again beat the other approaches in this instance.

Conclusions: The study reveals the wrapper method as the optimal feature selection technique. The ML methods outperform the traditional method, with Random Forest (RF) being the most effective predictive model for low-birth-weight prediction. The study suggests that policymakers in Bangladesh can mitigate low birth weight by considering the identified risk factors.
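
The wrapper method scores candidate feature subsets with the model's own performance; for a handful of features this can even be done exhaustively (a generic sketch, not the authors' pipeline, where `score_fn` would be a cross-validated model metric):

```python
from itertools import combinations

def wrapper_select(features, score_fn, max_size):
    """Exhaustive wrapper selection: evaluate every subset up to max_size
    with the model's own score and keep the best-scoring one."""
    best_subset, best_score = (), float("-inf")
    for r in range(1, max_size + 1):
        for subset in combinations(features, r):
            s = score_fn(subset)
            if s > best_score:
                best_subset, best_score = subset, s
    return list(best_subset)
```

Exhaustive search is only feasible for small feature pools; greedy forward/backward variants are the usual compromise for larger ones.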

  • Research Article
  • 10.1186/s12889-025-21658-y
Development of a machine learning model related to explore the association between heavy metal exposure and alveolar bone loss among US adults utilizing SHAP: a study based on NHANES 2015–2018
  • Feb 4, 2025
  • BMC Public Health
  • Jiayi Chen

Background: Alveolar bone loss (ABL) is common in modern society, and heavy metal exposure is generally considered a risk factor for ABL. Some studies have revealed a positive trend between urinary heavy metals and periodontitis using multiple logistic regression and Bayesian kernel machine regression, but overfitting of kernel functions, long computation times, the need to define prior distributions, and the lack of a ranking of the heavy metals all limit the performance of such statistical models. The optimal model for this problem remains controversial. This study aimed: (1) to develop an algorithm for exploring the association between heavy metal exposure and ABL; (2) to filter the actual causal variables and investigate how heavy metals are associated with ABL; and (3) to identify potential risk factors for ABL.

Methods: Data were collected from the National Health and Nutrition Examination Survey (NHANES) between 2015 and 2018 to develop a machine learning (ML) model. Feature selection was performed using Least Absolute Shrinkage and Selection Operator (LASSO) regression with 10-fold cross-validation. The selected data were balanced using the Synthetic Minority Oversampling Technique (SMOTE) and divided into training and testing sets at a 3:1 ratio. Logistic Regression (LR), Support Vector Machines (SVM), Random Forest (RF), K-Nearest Neighbor (KNN), Decision Tree (DT), and XGBoost were used to construct the ML model. Accuracy, Area Under the Receiver Operating Characteristic Curve (AUC), Precision, Recall, and F1 score were used to select the optimal model for further analysis. The contribution of the variables to the ML model was explained using the Shapley Additive Explanations (SHAP) method.

Results: RF showed the best performance in exploring the association between heavy metal exposure and ABL, with an AUC of 0.88, accuracy of 0.78, precision of 0.76, recall of 0.83, and F1 score of 0.79. Age was the most important factor in the ML model (mean |SHAP value| = 0.09), and Cd was the primary contributor among the heavy metals. Sex had little effect on the model's predictions.

Conclusion: In this study, RF outperformed the other five algorithms. Among the 12 heavy metals, Cd was the most important factor in the ML model; the associations of Co and Pb with ABL were weaker than that of Cd. Among all the independent variables, age was the most important factor for this model. As for PIR, low-income participants showed an association with ABL. Mexican American and Non-Hispanic White participants showed weaker associations with ABL than Non-Hispanic Black participants and other races. The gender feature demonstrated a weak association with ABL. In the future, more advanced algorithms should be developed to validate these results, and the related parameters can be tuned to improve model accuracy.

Clinical trial number: not applicable.
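The LASSO-selection, class-balancing, and Random Forest steps above can be sketched with scikit-learn alone. NHANES data are not bundled here, so a synthetic dataset stands in; naive random oversampling of the minority class replaces SMOTE, and `permutation_importance` is a model-agnostic stand-in for the SHAP analysis. All sizes and hyperparameters are illustrative assumptions.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.linear_model import LassoCV
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the exposure data (metal levels plus covariates).
X, y = make_classification(n_samples=600, n_features=15, n_informative=5,
                           weights=[0.8], random_state=1)

# 1) LASSO with a cross-validated penalty as the feature filter.
lasso = LassoCV(cv=10, random_state=1).fit(X, y)
keep = np.flatnonzero(lasso.coef_ != 0)
Xs = X[:, keep]

# 2) 3:1 split, then naive random oversampling of the minority class
#    (a simple stand-in for SMOTE, which interpolates new samples instead).
X_tr, X_te, y_tr, y_te = train_test_split(Xs, y, test_size=0.25,
                                          stratify=y, random_state=1)
minority = np.flatnonzero(y_tr == 1)
extra = np.random.default_rng(1).choice(
    minority, size=len(y_tr) - 2 * len(minority))
X_bal = np.vstack([X_tr, X_tr[extra]])
y_bal = np.concatenate([y_tr, y_tr[extra]])

# 3) Random Forest, scored by AUC; permutation importance ranks features.
rf = RandomForestClassifier(n_estimators=300, random_state=1).fit(X_bal, y_bal)
auc = roc_auc_score(y_te, rf.predict_proba(X_te)[:, 1])
imp = permutation_importance(rf, X_te, y_te, n_repeats=10, random_state=1)
print("kept feature indices:", keep, "| test AUC:", round(auc, 3))
```

Ranking `imp.importances_mean` largest-first gives the same kind of variable ordering the paper derives from mean |SHAP value|, though the numeric scales differ.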

  • Research Article
  • 10.1007/s11596-025-00017-3
Deep Learning-Based Diagnostic Model for Parkinson's Disease Using Handwritten Spiral and Wave Images.
  • Mar 3, 2025
  • Current medical science
  • K Aditya Shastry

To develop and validate a deep neural network (DNN) model for diagnosing Parkinson's Disease (PD) using handwritten spiral and wave images, and to compare its performance with various machine learning (ML) and deep learning (DL) models. The study used a dataset of 204 images (102 spiral and 102 wave) from PD patients and healthy subjects. The images were preprocessed using the Histogram of Oriented Gradients (HOG) descriptor and augmented to increase dataset diversity. The DNN model comprised an input layer, three convolutional layers, two max-pooling layers, two dropout layers, and two dense layers, and was trained and evaluated using metrics such as accuracy, sensitivity, specificity, and loss. It was compared with nine ML models (random forest, logistic regression, AdaBoost, k-nearest neighbor, gradient boost, naïve Bayes, support vector machine, decision tree) and two DL models (convolutional neural network, DenseNet-201). The DNN model outperformed all other models in diagnosing PD from handwritten spiral and wave images. On spiral images, it achieved accuracy improvements of 41.24% over naïve Bayes, 31.24% over decision tree, and 27.9% over support vector machine; on wave images, improvements of 40% over naïve Bayes, 36.67% over decision tree, and 30% over support vector machine. The DNN model also demonstrated significant improvements in sensitivity and specificity compared with the other models. The DNN model significantly improves the accuracy of PD diagnosis using handwritten spiral and wave images, outperforming several ML and DL models. This approach offers a promising diagnostic tool for early PD detection and a foundation for future work incorporating additional features to enhance detection accuracy.
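The HOG preprocessing step mentioned above can be illustrated with a minimal NumPy sketch: per-cell histograms of gradient orientation, weighted by gradient magnitude. Production pipelines (e.g. `skimage.feature.hog`) add block normalization and finer controls; the cell size, bin count, and the random "image" below are illustrative assumptions.

```python
import numpy as np

def hog_like_descriptor(img, bins=9, cell=8):
    """Minimal HOG-style descriptor: for each non-overlapping cell,
    a histogram of unsigned gradient orientations weighted by magnitude."""
    gy, gx = np.gradient(img.astype(float))   # image gradients
    mag = np.hypot(gx, gy)                    # gradient magnitude
    ang = np.rad2deg(np.arctan2(gy, gx)) % 180  # unsigned orientation
    h, w = img.shape
    feats = []
    for i in range(0, h - cell + 1, cell):
        for j in range(0, w - cell + 1, cell):
            m = mag[i:i + cell, j:j + cell].ravel()
            a = ang[i:i + cell, j:j + cell].ravel()
            hist, _ = np.histogram(a, bins=bins, range=(0, 180), weights=m)
            feats.append(hist)
    return np.concatenate(feats)

# Example on a 32x32 stand-in image: 4x4 cells * 9 bins -> a 144-d vector,
# which would then be fed to the downstream classifier.
img = np.random.default_rng(0).random((32, 32))
desc = hog_like_descriptor(img)
print(desc.shape)
```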

  • Research Article
  • Cited by 6
  • 10.2147/rmhp.s346856
Comparison Between Statistical Model and Machine Learning Methods for Predicting the Risk of Renal Function Decline Using Routine Clinical Data in Health Screening
  • Apr 26, 2022
  • Risk Management and Healthcare Policy
  • Xia Cao + 4 more

Purpose: Using machine learning to predict unknown data offers an opportunity to improve accuracy by exploring complex interactions between risk factors. We therefore evaluated the performance of machine learning (ML) algorithms and compared them with logistic regression for predicting the risk of renal function decline (RFD) using routine clinical data.

Patients and Methods: This retrospective cohort study includes data from 2166 subjects, aged 35–74 years, enrolled in an adult health screening follow-up program between 2010 and 2020. Seven ML models were considered (random forest, gradient boosting, multilayer perceptron, support vector machine, K-nearest neighbors, adaptive boosting, and decision tree) and compared with standard logistic regression. There were 24 independent variables, and the baseline estimated glomerular filtration rate (eGFR) was used as the predictive variable.

Results: A total of 2166 participants (mean age 49.2±11.2 years, 63.3% male) were enrolled and randomly divided into a training set (n=1732) and a test set (n=434). The area under the receiver operating characteristic curve (AUROC) for detecting RFD was above 0.85 for all models during the training phase. The gradient boosting algorithm achieved the best average prediction accuracy (AUROC: 0.914) among all algorithms validated in this study. Based on AUROC, the ML algorithms improved RFD prediction performance compared with the logistic regression model (AUROC: 0.882), except for the K-nearest neighbors and decision tree algorithms (AUROC: 0.854 and 0.824, respectively). However, the improvements over logistic regression were small (less than 4%) and nonsignificant.

Conclusion: Our results indicate that the proposed health screening dataset-based RFD prediction model using ML algorithms is readily applicable and produces validated results, but logistic regression performs as well as the ML models for predicting the risk of RFD with simple clinical predictors.
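A head-to-head AUROC comparison like the one above is straightforward to reproduce with scikit-learn. The screening cohort is not public, so a synthetic dataset with the same sample size and 24 predictors stands in; the models and metric match the study, but all settings below are illustrative.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in: 2166 subjects, 24 predictors, binary "RFD" outcome.
X, y = make_classification(n_samples=2166, n_features=24, n_informative=8,
                           random_state=2)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=2)

models = {
    "logistic regression": make_pipeline(StandardScaler(),
                                         LogisticRegression(max_iter=1000)),
    "gradient boosting": GradientBoostingClassifier(random_state=2),
}
aucs = {}
for name, model in models.items():
    model.fit(X_tr, y_tr)
    # AUROC on the held-out test set, from predicted positive-class probability.
    aucs[name] = roc_auc_score(y_te, model.predict_proba(X_te)[:, 1])
    print(f"{name}: AUROC = {aucs[name]:.3f}")
```

Whether the boosted model's AUROC gain over logistic regression is meaningful depends on the data, which is exactly the study's point: with simple clinical predictors, the gap can be small.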

  • Research Article
  • Cited by 1
  • 10.1016/j.ijmedinf.2025.105888
Comparison of machine learning and logistic regression models for predicting emergence delirium in elderly patients: A prospective study.
  • Jul 1, 2025
  • International journal of medical informatics
  • Yufan Lu + 7 more

