Abstract
ObjectiveHepatic encephalopathy (HE) is among the most common complications of cirrhosis. Data for cirrhosis with HE is typically unbalanced. Traditional statistical methods and machine learning algorithms thus cannot identify a few classes. In this paper, we use machine learning algorithms to construct a risk prediction model for liver cirrhosis complicated by HE to improve the efficiency of its prediction. MethodWe collected medical data from 1,256 patients with cirrhosis and performed preprocessing to extract 81 features from these irregular data. To predict HE in cirrhotic patients, we compared several classification methods: logistic regression, weighted random forest (WRF), SVM, and weighted SVM (WSVM). We also used an additional 722 patients with cirrhosis for external validation of the model. ResultsThe WRF, WSVM, and logistic regression models exhibited better recognition ability for patients with HE than traditional machine learning models (sensitivity> 0.70), but their ability to identify patients with uncomplicated HE was slightly lower (specificity approximately 85%). The comprehensive evaluation index of the traditional model was higher than those of other models (G-means> 0.80 and F-measure> 0.40). For the WRF, the G-means (0.82), F-measure (0.46), and AUC (0.82) were superior to those of the logistic regression and WSVM models, which means that it can better predict the incidence of HE in patients. ConclusionThe WRF model is more suitable for the classification of unbalanced medical data and can be used to construct a risk prediction and evaluation system for liver cirrhosis complicated with HE. The probabilistic prediction models of WRF can help clinicians identify high-risk patients with HE.
Published Version
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have