Abstract

To explore the feasibility and validity of machine learning models in determining causality in medical malpractice cases and to try to increase the scientificity and reliability of identification opinions. We collected 13,245 written judgments from PKULAW.COM, a public database. 963 cases were included after the initial screening. 21 medical and ten patient factors were selected as characteristic variables by summarising previous literature and cases. Random Forest, eXtreme Gradient Boosting (XGBoost) and Light Gradient Boosting Machine (LightGBM) were used to establish prediction models of causality for the two data sets, respectively. Finally, the optimal model is obtained by hyperparameter tuning of the six models. We built three real data set models and three virtual data set models by three algorithms, and their confusion matrices differed. XGBoost performed best in the real data set, with a model accuracy of 66%. In the virtual data set, the performance of XGBoost and LightGBM was basically the same, and the model accuracy rate was 80%. The overall accuracy of external verification was 72.7%. The optimal model of this study is expected to predict the causality accurately.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call