Abstract

In this research, the authors found that statistical analysis is an important preliminary phase in machine learning, especially for regression problems. When they first built the individual models, using the same algorithms and the same dataset, performance was poor. After verifying the assumptions of multiple linear regression, they adjusted the data accordingly and obtained efficient models. The objective was to apply a stacking model to predict patient length of stay in a semi-urban hospital, and the results showed that the stacking regressor performed better than each of the seven individual models implemented (Random Forest, Extra Trees, Decision Tree, XGBoost, Multilayer Perceptron, LightGBM, and Support Vector Regressor (SVR)). The stacking model combined the Random Forest Regressor, Extra Trees Regressor, Decision Tree Regressor, XGBoost, LightGBM, and SVR. Using secondary data from four services (Pediatrics, Hospitalization, Gynecology, and Neonatology) of a semi-urban hospital located in a region of ongoing war in the eastern Democratic Republic of Congo (DRC), the study examined the minimum length of stay of a patient admitted to one of these four services. Performance was evaluated using MAE, RMSE, MSE, R-squared, and accuracy. For the stacking regression model, accuracy rose from 85% before the statistical analysis phase to 91% after it, and R-squared rose from 0.75 to 0.91.
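As a rough illustration of the stacking approach described in the abstract, the sketch below builds a stacked regressor with scikit-learn and reports MAE, MSE, RMSE, and R-squared. It is not the authors' pipeline. The data are synthetic (generated with `make_regression`) and stand in for the hospital records, the `Ridge` meta-learner is an assumption because the abstract does not name one, and XGBoost and LightGBM are left out so that the example depends on scikit-learn alone. All hyperparameters are placeholders.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import (
    ExtraTreesRegressor,
    RandomForestRegressor,
    StackingRegressor,
)
from sklearn.linear_model import Ridge
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR
from sklearn.tree import DecisionTreeRegressor

# Synthetic stand-in for the length-of-stay data (assumption: the real
# features and target are not available here).
X, y = make_regression(n_samples=300, n_features=8, noise=10.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Base learners: a subset of those listed in the abstract.
# XGBoost and LightGBM are omitted to avoid extra dependencies.
base_learners = [
    ("rf", RandomForestRegressor(n_estimators=50, random_state=0)),
    ("et", ExtraTreesRegressor(n_estimators=50, random_state=0)),
    ("dt", DecisionTreeRegressor(random_state=0)),
    # SVR is scale-sensitive, so it is wrapped with a scaler.
    ("svr", make_pipeline(StandardScaler(), SVR())),
]

# The meta-learner (Ridge) is an assumption; it is fitted on the
# out-of-fold predictions of the base learners (5-fold CV).
stack = StackingRegressor(estimators=base_learners, final_estimator=Ridge(), cv=5)
stack.fit(X_train, y_train)
y_pred = stack.predict(X_test)

# Regression metrics named in the abstract.
mae = mean_absolute_error(y_test, y_pred)
mse = mean_squared_error(y_test, y_pred)
rmse = float(np.sqrt(mse))
r2 = r2_score(y_test, y_pred)

print(f"MAE={mae:.3f}  MSE={mse:.3f}  RMSE={rmse:.3f}  R2={r2:.3f}")
```

Fitting each base learner from `base_learners` on its own and comparing its metrics with those of `stack` would mirror the comparison the abstract reports, although results on synthetic data say nothing about the hospital dataset.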
