Abstract

Early detection of at-risk patients is of great importance in Intensive Care Units (ICUs), where it can improve patient care and save lives. Severity-of-illness scores have traditionally been used to predict a patient's risk of mortality, but their poor accuracy is a weakness. Machine Learning (ML) models are therefore used as decision support for this task. Several challenges must be overcome to predict the risk of mortality accurately, for instance finding the medical measurements, or features, that most influence the prediction. A major obstacle is imbalanced class distribution (i.e., patients at risk are far fewer than patients not at risk), which produces the so-called accuracy paradox: a model can reach high overall accuracy while missing most of the minority class. Related work has applied ML models and various methods to address these challenges, but important details and comparisons between the methods are still missing. This thesis therefore presents an overview of implementing the main building block of such medical decision support. It uses an ensemble ML model, the Gradient Boosting Decision Tree (GBDT), which performs well even on imbalanced data. Moreover, the thesis provides detailed steps for implementing the model and for pre-processing the data. Different ML models, feature selection methods, and methods for handling imbalanced data are compared and tested on a real-world ICU dataset. Furthermore, an efficient cluster-based under-sampling method for handling imbalanced data is implemented. Mortality prediction in the related work is generic (i.e., for patients with different diseases). Some works predict mortality based on patient similarity over a large number of features, which has the weakness of high computational time and complexity.
In this thesis, an approach that avoids this computational complexity and optimizes the accuracy of risk prediction is presented and implemented. The approach is based on mortality prediction for similar patients with the same disease classification. The work is compared to the related work and to the commonly used severity-of-illness score, and is verified on another ICU dataset. The results show a significant performance improvement over the severity scores and the related work, as well as high accuracy on the other dataset. Moreover, the achieved results are promising, specifically the high prediction performance on the critical cases of patients at risk (i.e., the rare cases of the minority class). An area under the curve (AUC) of 0.956 is achieved.
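The accuracy paradox mentioned in the abstract can be illustrated with a minimal sketch. The cohort below is hypothetical (1000 ICU stays with 100 deaths, numbers chosen only for illustration and not taken from the thesis dataset), and the "model" is a trivial baseline that always predicts survival, not the GBDT used in the thesis.

```python
# Hypothetical cohort: 1000 ICU stays, 100 deaths (10% minority class).
# These counts are illustrative assumptions, not data from the thesis.
labels = [1] * 100 + [0] * 900

# A trivial baseline that always predicts the majority class (survival).
predictions = [0] * len(labels)


def accuracy(y_true, y_pred):
    """Fraction of all cases predicted correctly."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def recall(y_true, y_pred, positive=1):
    """Fraction of actual positive cases that the model identifies."""
    true_pos = sum(t == positive and p == positive
                   for t, p in zip(y_true, y_pred))
    actual_pos = sum(t == positive for t in y_true)
    return true_pos / actual_pos


acc = accuracy(labels, predictions)
rec = recall(labels, predictions)
print(f"accuracy = {acc:.2f}, recall on at-risk patients = {rec:.2f}")
# prints: accuracy = 0.90, recall on at-risk patients = 0.00
```

The baseline looks accurate while identifying none of the at-risk patients, which is why performance on the minority class and threshold-independent measures such as AUC are more informative than raw accuracy on imbalanced data.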
