Abstract
This study aims to forecast stroke occurrence at the level of the individual patient. Exploratory data analysis revealed notable disparities in the distribution of stroke and non-stroke cases and clarified how various health and lifestyle factors influence stroke susceptibility. The project illustrates the potential of machine learning in medical prediction, both for helping patients assess their risk and for helping medical practitioners devise treatment strategies. Two predictive models were used, RandomForest and DecisionTree. They were evaluated with the confusion matrix, the Receiver Operating Characteristic (ROC) curve, and the Precision-Recall curve, which together give a detailed picture of model performance. Several features contain missing data, which highlights both the difficulty that data gaps pose for medical prediction and the need for effective methods of handling them. In the experiments, RandomForest achieved an Area Under the Curve (AUC) of 95% and DecisionTree 92%, indicating strong predictive capability. Future work may focus on refining the prediction models, improving the balance of the dataset, and expanding the dataset to enhance prediction precision.
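To make the evaluation metrics named above concrete, the following is a minimal sketch in plain Python of how a confusion matrix, a ROC AUC, and a single precision-recall point can be computed from predicted scores. It is not the authors' pipeline: the function names, the 0.5 threshold, and the ten labels and scores are hypothetical and purely illustrative, chosen so that positive (stroke) cases are rare.

```python
# Illustrative sketch only: the labels and scores below are synthetic,
# not taken from the study's dataset or models.

def confusion_matrix(y_true, y_pred):
    """Return (tn, fp, fn, tp) for binary labels encoded as 0/1."""
    tn = fp = fn = tp = 0
    for t, p in zip(y_true, y_pred):
        if t == 1 and p == 1:
            tp += 1
        elif t == 1:
            fn += 1
        elif p == 1:
            fp += 1
        else:
            tn += 1
    return tn, fp, fn, tp


def roc_auc(y_true, scores):
    """ROC AUC as the probability that a random positive case is scored
    above a random negative case (ties count as half a win)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def precision_recall(y_true, scores, threshold):
    """One point of the precision-recall curve at the given threshold."""
    y_pred = [1 if s >= threshold else 0 for s in scores]
    tn, fp, fn, tp = confusion_matrix(y_true, y_pred)
    # Convention (an assumption): precision is 1.0 when nothing is predicted positive.
    precision = tp / (tp + fp) if (tp + fp) else 1.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical imbalanced sample: 2 positive cases out of 10.
y_true = [0, 0, 0, 0, 0, 0, 0, 1, 0, 1]
scores = [0.05, 0.10, 0.20, 0.15, 0.30, 0.40, 0.35, 0.45, 0.60, 0.80]

if __name__ == "__main__":
    y_pred = [1 if s >= 0.5 else 0 for s in scores]
    print("tn, fp, fn, tp:", confusion_matrix(y_true, y_pred))
    print("ROC AUC:", roc_auc(y_true, scores))
    print("precision, recall at 0.5:", precision_recall(y_true, scores, 0.5))
```

On this toy sample the AUC is high while recall at the 0.5 threshold is only one half, which is one reason a precision-recall view is commonly reported alongside the ROC curve when positive cases are rare.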