Background: Non-alcoholic fatty liver disease (NAFLD) is a leading cause of liver-related morbidity and mortality. Diagnosing non-alcoholic steatohepatitis (NASH) plays a crucial role in the management of NAFLD patients.

Objective: The aim of this observational study was to build a machine learning model to identify NASH in NAFLD patients.

Methods: The clinical characteristics and initial laboratory data of 259 NAFLD patients (Cohort 1) were collected to train the model and perform internal validation. We compared models built with five machine learning algorithms and selected the best-performing model. Receiver operating characteristic (ROC) curves, sensitivity, specificity, and accuracy were used to evaluate model performance. In addition, the model was externally validated in the NAFLD patients of Cohort 2 (n = 181).

Results: We identified six independent risk factors for predicting NASH: neutrophil percentage (NEU%), the aspartate aminotransferase/alanine aminotransferase ratio (AST/ALT), hematocrit (HCT), creatinine (CREA), uric acid (UA), and prealbumin (PA). The NASH-XGB6 model, built using the XGBoost algorithm, showed good predictive performance, with areas under the ROC curve of 0.95 (95% CI, 0.91–0.98) in Cohort 1 and 0.90 (95% CI, 0.88–0.93) in Cohort 2.

Conclusions: NASH-XGB6 can serve as an effective tool for identifying NASH among NAFLD patients.
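The evaluation metrics named in the Methods can be illustrated with a minimal sketch. This is not the study's code: the function names, the 0.5 decision threshold, and the toy labels and scores below are assumptions made up for illustration only. It computes sensitivity, specificity, and accuracy from binary predictions, and the area under the ROC curve as the fraction of (NASH, non-NASH) pairs in which the NASH case receives the higher score.

```python
from typing import Sequence, Tuple


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (tp, fp, tn, fn) for binary labels, with 1 = NASH and 0 = non-NASH."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, fp, tn, fn


def sensitivity(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Share of true NASH cases that were predicted as NASH."""
    tp, _, _, fn = confusion_counts(y_true, y_pred)
    return tp / (tp + fn)


def specificity(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Share of non-NASH cases that were predicted as non-NASH."""
    _, fp, tn, _ = confusion_counts(y_true, y_pred)
    return tn / (tn + fp)


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Share of all cases that were classified correctly."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    return (tp + tn) / (tp + fp + tn + fn)


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise comparison (ties count as 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("Both classes must be present to compute the AUC.")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Synthetic toy data (NOT from the study): true labels and model scores.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.1, 0.7, 0.2]

# Hypothetical decision threshold of 0.5 to turn scores into class predictions.
y_pred = [1 if s >= 0.5 else 0 for s in scores]

metrics = {
    "sensitivity": sensitivity(y_true, y_pred),
    "specificity": specificity(y_true, y_pred),
    "accuracy": accuracy(y_true, y_pred),
    "auc": roc_auc(y_true, scores),
}
```

Sensitivity, specificity, and accuracy depend on the chosen threshold, whereas the area under the ROC curve summarizes ranking quality across all thresholds, which is one reason the two kinds of metric are commonly reported together.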