Abstract
Data plays an indispensable role in today's world. Understanding and interpreting data correctly lays the foundation for the growth and success of a company or organization. Like business, finance and banking, the health sector produces huge amounts of data. This data needs to be properly analyzed and summarized before it is modeled for a specific purpose. Clinical data generally involves stakeholders such as doctors, technicians, lab analysts, hospital managers, care providers and insurance agents. Exploratory Data Analysis (EDA) gives a complete picture of a dataset and helps identify new insights and hidden patterns in the data, which makes it the most significant step before the data is actually preprocessed. In this paper we apply EDA to the Statlog heart disease dataset to identify the important variables, correlations between variables, missing values and outliers, and we also apply principal component analysis (PCA). To verify whether EDA actually affects performance, we use the machine learning algorithms Naïve Bayes, Logistic Regression, Decision Tree, Support Vector Machine and Random Forest. The results indicate that the performance of the prediction model increases considerably after performing EDA, regardless of the prediction algorithm used. The graphical analysis of the dataset also helps stakeholders make better decisions about their patients and their treatments. Understanding clinical data before modeling prevents erroneous models later, and exploratory analysis helps achieve this.
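As a rough illustration of three of the EDA checks named above (missing values, outliers and correlations), the following minimal sketch uses only the Python standard library. It is not the authors' implementation: the records, the column names (`age`, `chol`, `max_hr`), the helper functions and the 1.5 × IQR outlier rule are all assumptions made for this example, and the values are toy data rather than rows from the Statlog dataset. PCA is omitted because it would need a numerical library.

```python
import math
import statistics

# Hypothetical toy records for illustration only; NOT taken from the Statlog dataset.
records = [
    {"age": 63, "chol": 233, "max_hr": 150},
    {"age": 67, "chol": 286, "max_hr": 108},
    {"age": 41, "chol": None, "max_hr": 172},  # missing cholesterol value
    {"age": 56, "chol": 236, "max_hr": 178},
    {"age": 62, "chol": 564, "max_hr": 160},   # unusually high cholesterol
    {"age": 57, "chol": 241, "max_hr": 123},
    {"age": 44, "chol": 226, "max_hr": 169},
    {"age": 52, "chol": 212, "max_hr": 168},
]


def missing_counts(rows):
    """Count missing (None) entries per column."""
    return {col: sum(1 for r in rows if r[col] is None) for col in rows[0]}


def iqr_outliers(values):
    """Return values outside 1.5 * IQR of the quartiles (an assumed, common rule)."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return [v for v in values if v < low or v > high]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


# Drop missing values before computing per-column statistics.
chol = [r["chol"] for r in records if r["chol"] is not None]
ages = [r["age"] for r in records]
rates = [r["max_hr"] for r in records]

print("missing per column:", missing_counts(records))
print("cholesterol outliers:", iqr_outliers(chol))
print("corr(age, max_hr):", round(pearson(ages, rates), 3))
```

On a real dataset the same checks would normally be run per column before any imputation or model fitting, so that missing entries and extreme values are handled deliberately rather than silently.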
More From: International Journal of Recent Technology and Engineering (IJRTE)