Abstract

Aims: Machine learning (ML) approaches are beneficial when automatic identification of relevant features among numerous candidates is desired. We investigated the predictive ability of several ML models for new onset of diabetes mellitus.

Methods: In 10,248 subjects who received annual health examinations, 58 candidate predictors were evaluated, including the fatty liver index (FLI), which is calculated from waist circumference, body mass index, and levels of triglycerides and γ-glutamyl transferase.

Results: During a 10-year follow-up period (mean period: 6.9 years), 322 subjects (6.5 %) in the training group (70 %, n=7,173) and 127 subjects (6.2 %) in the test group (30 %, n=3,075) had new onset of diabetes mellitus. Hemoglobin A1c, fasting glucose and FLI were identified as the top 3 predictors by random forest feature selection with 10-fold cross-validation. When hemoglobin A1c and FLI were used as the selected features, the C-statistics (analogous to the area under the receiver operating characteristic curve) for logistic regression, naïve Bayes, extreme gradient boosting and artificial neural network models were 0.874, 0.869, 0.856 and 0.869, respectively. There was no significant difference in discriminatory capacity among the ML models.

Conclusions: ML models incorporating hemoglobin A1c and FLI provide an accurate and straightforward approach for predicting the development of diabetes mellitus.
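The abstract names the four inputs to the FLI but does not state the formula. As a minimal sketch, assuming the commonly cited definition of the index (Bedogni et al., 2006) with triglycerides in mg/dL, γ-glutamyl transferase in U/L, body mass index in kg/m² and waist circumference in cm, the calculation could look like the following. The function name, argument names and units are illustrative assumptions and are not taken from the study itself.

```python
import math


def fatty_liver_index(triglycerides_mg_dl, bmi_kg_m2, ggt_u_l, waist_cm):
    """Return the fatty liver index on a 0-100 scale.

    Coefficients follow the commonly cited FLI definition (assumed here,
    not stated in the abstract). Inputs must be positive because the
    triglyceride and GGT terms are log-transformed.
    """
    linear = (
        0.953 * math.log(triglycerides_mg_dl)
        + 0.139 * bmi_kg_m2
        + 0.718 * math.log(ggt_u_l)
        + 0.053 * waist_cm
        - 15.745
    )
    # Logistic transform of the linear predictor, scaled to 0-100.
    return 100.0 / (1.0 + math.exp(-linear))


if __name__ == "__main__":
    # Hypothetical subject, for illustration only.
    print(round(fatty_liver_index(150, 25, 30, 90), 1))
```

Because the index is a logistic transform of a linear score, it always lies between 0 and 100 and rises with each of the four inputs, which makes it convenient to use as a single feature alongside hemoglobin A1c.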
