Stroke is a major cause of mortality and morbidity worldwide, and early detection and prevention are essential for improving patient outcomes. In recent years, machine learning algorithms have been used to predict stroke risk from large amounts of clinical and demographic data. The main objective of this thesis is to develop a stroke prediction system based on the Random Forest machine learning algorithm. The project aims to increase the accuracy of stroke detection while addressing the shortcomings of the existing system, which is based on logistic regression and has issues with real-time deployment and interpretability. The research goals are to develop and apply an ensemble machine learning-based stroke prediction system, to optimize its performance using ensemble machine learning algorithms, to assess that performance, and to deploy the model in real time using Python Django. The study is significant because it could improve public health by lessening the severity and consequences of strokes through early diagnosis and treatment. The research methodology comprises data collection, preprocessing, model selection, evaluation, and real-time deployment using Python Django. The dataset consists of 5110 rows and has a total size of 69 KB. The performance of the stroke prediction algorithm was evaluated using metrics derived from the confusion matrix: accuracy, precision, recall, and F1-score. At the end of the research, the Random Forest model achieved an accuracy of 98.5%, compared with 86% for the existing logistic regression model.
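The four evaluation metrics named above all follow from the counts in a binary confusion matrix. The sketch below is only an illustration of those standard definitions, not the thesis's own evaluation code: the helper names (`confusion_counts`, `classification_metrics`) and the toy labels are assumptions introduced here, and the label `1` is assumed to mean "stroke".

```python
from dataclasses import dataclass


@dataclass
class ConfusionCounts:
    """Cells of a binary confusion matrix (hypothetical helper type)."""
    tp: int  # predicted positive, actually positive
    fp: int  # predicted positive, actually negative
    fn: int  # predicted negative, actually positive
    tn: int  # predicted negative, actually negative


def confusion_counts(y_true, y_pred, positive=1):
    """Tally the four confusion-matrix cells from paired labels."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1
        elif pred == positive:
            fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1
    return ConfusionCounts(tp, fp, fn, tn)


def classification_metrics(c):
    """Compute accuracy, precision, recall and F1 from the counts.

    Any ratio with a zero denominator is reported as 0.0.
    """
    total = c.tp + c.fp + c.fn + c.tn
    accuracy = (c.tp + c.tn) / total if total else 0.0
    precision = c.tp / (c.tp + c.fp) if (c.tp + c.fp) else 0.0
    recall = c.tp / (c.tp + c.fn) if (c.tp + c.fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


if __name__ == "__main__":
    # Toy labels for illustration only; these are not data from the thesis.
    y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
    y_pred = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]
    counts = confusion_counts(y_true, y_pred)
    print(counts)
    print(classification_metrics(counts))
```

Reporting precision, recall, and F1-score alongside accuracy matters when the positive class is rare, because a classifier can then score high accuracy while still missing many positive cases.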