This study aimed to build a model-based approach for predicting gastrointestinal side effects after initiation of metformin therapy. The model was developed using data from four randomised clinical trial cohorts comprising 1736 patients with type 2 diabetes. Prediction models were built on either the integrated set of indicators or a simplified subset. Ten machine learning models were trained, and Shapley values were used to report each feature's contribution. Seventy percent of participants (1216) were allocated to the training set, 15% (260) to the internal validation set and 15% (260) to the test set. The Extra Tree model had the highest area under the curve (AUC) (0.87) in the validation and test sets. The top five indicators were blood urea nitrogen (BUN), sex, triglyceride (TG), high-density lipoprotein-cholesterol (HDL-C) and total cholesterol (TC); these five were selected to construct a simplified predictive model (AUC = 0.76). An online web-based tool was established based on the predictive models using all 17 integrated features and the top five indicators. Only a few easily obtained features are needed to predict gastrointestinal side effects in patients with diabetes who are starting metformin. The model can be applied to the Chinese population in clinical practice.
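For readers less familiar with the evaluation setup, the 70/15/15 partition and the AUC metric can be illustrated with a minimal sketch. This is not the study's code: the function names `split_indices` and `auc`, the random seed and the toy labels and scores are all assumptions made for illustration, and only the cohort sizes (1736, 1216, 260, 260) come from the abstract.

```python
import random

def split_indices(n, n_val, n_test, seed=0):
    """Shuffle range(n) and cut it into train / validation / test index lists.

    Hypothetical helper: the abstract does not say how the split was drawn.
    """
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    n_train = n - n_val - n_test
    return idx[:n_train], idx[n_train:n_train + n_val], idx[n_train + n_val:]

def auc(labels, scores):
    """AUC as the probability that a random positive case is scored above
    a random negative case (ties count as 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

# Cohort sizes taken from the abstract: 1736 patients, 260 validation, 260 test.
train, val, test = split_indices(1736, n_val=260, n_test=260)
print(len(train), len(val), len(test))  # 1216 260 260

# Toy labels and scores (made up) to show how the metric behaves.
print(auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 0.75
```

Passing integer set sizes rather than fractions keeps the partition at exactly 1216/260/260, since 70% of 1736 is not a whole number. The pairwise AUC shown here is quadratic in sample size, which is fine for sets of a few hundred patients; a rank-based implementation would be preferable for larger cohorts.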