Abstract

Introduction: Heart disease continues to be a leading cause of death worldwide; hence, early detection is recommended to reduce the fatality rate. Consequently, prediction models with high accuracy scores become important. Previous research has demonstrated encouraging results; however, issues such as variation between databases and small-sized databases remain. Hypothesis: We assessed the hypothesis that a web application combining user input, a large database, and a machine learning model can predict the occurrence of heart disease with a high accuracy score. Methods: A web application composed of two modules, the admin side and the user side, was built. The admin side consisted of a prediction model built from a database and a classification algorithm. To create the admin side, this project used data from the third visit of the Atherosclerosis Risk in Communities (ARIC) study, a multi-center population-based cohort study. A total of 11,351 participants attended the third ARIC visit. Heart disease and other variables such as age, gender, race, education, smoking, drinking, physical activity, etc., were selected to build the prediction model of the web application. The database was split into independent variables ('x') and a dependent variable ('y'), and was then randomly split into 75% (8,514 of 11,351) training and 25% (2,837 of 11,351) testing data. A random forest classifier was fitted on the training data and evaluated on the testing data. The user side consisted of input fields, one per feature, that allow users to enter their information into the web application. The user input is compared with the data used to build the web application's prediction model. After the user enters this information, the prediction model predicts the disease and reports the model's test accuracy score. The web application was built in Python with the help of packages and libraries such as streamlit, pandas, numpy, sklearn, and PIL. Results: The machine learning model achieved high accuracy in predicting heart disease. Conclusions: The current findings suggest that high test accuracy scores for a disease prediction model can be achieved when using random forest. Machine learning can be an effective tool for the early detection of heart disease and for supporting decisions on further prevention and improved treatment.
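
The split-and-classify step described in the Methods can be sketched as follows. This is a minimal illustration, not the authors' code: the ARIC data are not reproduced here, so the feature matrix is a synthetic stand-in, and the feature count, `n_estimators`, the random seeds, and the all-zero `user_input` row are all assumptions. The test size is passed as an integer (2,837) so that the split matches the reported counts; a fractional `test_size=0.25` in scikit-learn rounds the test set up and would give 2,838.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the ARIC visit 3 data (assumption: NOT real study data).
rng = np.random.default_rng(0)
n_participants = 11_351
X = rng.normal(size=(n_participants, 8))  # 8 placeholder features ('x')
# Placeholder binary outcome ('y') loosely tied to the first two features.
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(size=n_participants) > 0).astype(int)

# Random split into 8,514 training and 2,837 testing rows, as reported.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=2837, random_state=0
)

# Fit the random forest on the training rows only (hyperparameters are assumptions).
model = RandomForestClassifier(n_estimators=50, random_state=0)
model.fit(X_train, y_train)

# Test accuracy is measured on the held-out rows.
test_accuracy = accuracy_score(y_test, model.predict(X_test))

# Hypothetical single-user input: one row with the same columns as X.
user_input = np.zeros((1, X.shape[1]))
prediction = model.predict(user_input)[0]

print(f"train rows: {X_train.shape[0]}, test rows: {X_test.shape[0]}")
print(f"test accuracy: {test_accuracy:.3f}, prediction for user: {prediction}")
```

In a Streamlit front end, the `user_input` row would be assembled from the values entered in the input fields rather than hard-coded.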
