Abstract
BackgroundInsulin resistance is an early-stage deterioration of Type 2 diabetes. Identification and quantification of insulin resistance requires specific blood tests; however, the triglyceride-glucose (TyG) index can provide a surrogate assessment from routine Electronic Health Record (EHR) data. Since insulin resistance is a multi-factorial condition, to improve its characterisation, this study aims to discover non-trivial clinical factors in EHR data to determine where the insulin-resistance condition is encoded. MethodsWe proposed a high-interpretable Machine Learning approach (i.e., ensemble Regression Forest combined with data imputation strategies), named TyG-er. We applied three different experimental procedures to test TyG-er reliability on the Italian Federation of General Practitioners dataset, named FIMMG_obs dataset, which is publicly available and reflects the clinical use-case (i.e., not all laboratory exams are prescribed on a regular basis over time). ResultsResults detected non-conventional clinical factors (i.e., uricemia, leukocytes, gamma-glutamyltransferase and protein profile) and provided novel insight into the best combination of clinical factors for detecting early glucose tolerance deterioration. The robustness of these extracted clinical factors was confirmed by the high agreement (from 0.664 to 0.911 of Lin's correlation coefficient (rc)) of the TyG-er approach among different experimental procedures. Moreover, the results of the three experimental procedures outlined the predictive power of the TyG-er approach (up to a mean absolute error of 5.68% and rc=0.666,p<.05). ConclusionsThe TyG-er approach is able to carry information about the identification of the TyG index, strictly correlated with the insulin-resistance condition, while extracting the most relevant non-glycemic features from routine data.
Published Version
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have