Comparison of Data Mining Techniques on Stroke Clinical Dataset

Viko Pradana Prasetyo, Arif Djunaidy, Makhi Hakim Hakiki, Retno Aulia Vinarti, Muhammad Fajrul Alam Ulin Nuha

doi:10.1016/j.procs.2024.03.033

Abstract

Stroke is a significant cause of mortality and morbidity worldwide, so identifying individuals at risk of a stroke is essential. This study develops a predictive model of an individual's stroke risk from their medical history and evaluates the effect of preprocessing techniques on the model's performance. The methodology runs two streams of analysis, one with and one without data preprocessing, each using three classification models to predict stroke risk: K-Nearest Neighbor (KNN), Decision Tree, and Support Vector Machine (SVM). The results indicate that data preprocessing improves the performance of all three models. KNN and SVM show high precision and recall, making them effective models for predicting strokes. The decision tree model also performs well with data preprocessing, although its accuracy and recall are slightly lower. These findings suggest that preprocessing is a crucial stage in machine learning and can enhance the performance of classification models in predicting stroke risk.
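
The two-stream comparison described above can be illustrated with a minimal sketch. It is not the authors' pipeline: the abstract does not name the preprocessing steps, so min-max scaling is assumed here as a stand-in. The data are synthetic (one informative feature with a small range, one noise feature with a large range), and only KNN is shown for brevity. All function names are hypothetical.

```python
import math
import random


def min_max_scale(train, test):
    """Scale each feature to [0, 1] using the training set's min and max."""
    n_features = len(train[0])
    lows = [min(row[j] for row in train) for j in range(n_features)]
    highs = [max(row[j] for row in train) for j in range(n_features)]

    def scale(row):
        return [(v - lo) / (hi - lo) if hi > lo else 0.0
                for v, lo, hi in zip(row, lows, highs)]

    return [scale(r) for r in train], [scale(r) for r in test]


def knn_predict(train_X, train_y, query, k=3):
    """Majority vote among the k nearest training points (Euclidean distance)."""
    nearest = sorted(zip(train_X, train_y),
                     key=lambda pair: math.dist(pair[0], query))[:k]
    votes = sum(label for _, label in nearest)
    return 1 if votes * 2 > k else 0


def precision_recall(y_true, y_pred):
    """Precision and recall for the positive class (label 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def make_synthetic_data(n, seed):
    """Synthetic stand-in data; NOT the stroke clinical dataset."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        label = rng.randint(0, 1)
        informative = rng.gauss(label, 0.3)  # small range, carries the signal
        noisy = rng.gauss(0, 100)            # large range, pure noise
        X.append([informative, noisy])
        y.append(label)
    return X, y


def evaluate(train_X, train_y, test_X, test_y, preprocess):
    """Run one stream: optionally scale, then classify and score."""
    if preprocess:
        train_X, test_X = min_max_scale(train_X, test_X)
    preds = [knn_predict(train_X, train_y, x) for x in test_X]
    return precision_recall(test_y, preds)


train_X, train_y = make_synthetic_data(200, seed=0)
test_X, test_y = make_synthetic_data(100, seed=1)

# Stream 1: raw features. Stream 2: the same model after preprocessing.
for preprocess in (False, True):
    p, r = evaluate(train_X, train_y, test_X, test_y, preprocess)
    print(f"preprocess={preprocess}: precision={p:.2f}, recall={r:.2f}")
```

Distance-based models such as KNN are sensitive to feature scale, which is one plausible reason preprocessing could change their scores. The figures this sketch prints describe only the synthetic data and say nothing about the paper's reported results.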

Full Text