Abstract

Detecting disease at an early stage is a major challenge. Datasets for different diseases are available online, each with a different number of features describing the disease. Many dimensionality reduction and feature extraction techniques are now used to reduce the number of features in a dataset and to find the most relevant ones. This paper explores how the performance of different machine learning models changes when the Principal Component Analysis (PCA) dimensionality reduction technique is applied to datasets of chronic kidney disease and cardiovascular disease. The authors apply Logistic Regression, K Nearest Neighbour (KNN), Naïve Bayes, Support Vector Machine and Random Forest models to the datasets and compare the performance of each model with and without PCA. A key challenge in data mining and machine learning is building accurate and computationally efficient classifiers for medical applications. With an accuracy of 100% for chronic kidney disease and 85% for heart disease, the KNN classifier and logistic regression were found to be the best-performing prediction methods for kidney and heart disease, respectively.
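The comparison described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' code. It assumes scikit-learn, uses the library's bundled breast cancer dataset as a stand-in because the kidney and heart datasets are not part of this page, and picks the train/test split, the scaling step, the number of principal components and all hyperparameters arbitrarily. The function name `compare_with_and_without_pca` is hypothetical.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


def compare_with_and_without_pca(X, y, n_components=10, seed=0):
    """Return {model name: (accuracy without PCA, accuracy with PCA)}."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, stratify=y, random_state=seed
    )
    # Factories, so each pipeline gets a fresh, unfitted estimator.
    # Hyperparameters are illustrative assumptions, not the paper's settings.
    model_factories = {
        "Logistic Regression": lambda: LogisticRegression(max_iter=1000),
        "K Nearest Neighbour": lambda: KNeighborsClassifier(n_neighbors=5),
        "Naive Bayes": lambda: GaussianNB(),
        "Support Vector Machine": lambda: SVC(kernel="rbf"),
        "Random Forest": lambda: RandomForestClassifier(random_state=seed),
    }
    results = {}
    for name, make_model in model_factories.items():
        # Baseline: standardise the features, then classify.
        plain = make_pipeline(StandardScaler(), make_model())
        # Same model, with PCA inserted between scaling and classification.
        reduced = make_pipeline(
            StandardScaler(), PCA(n_components=n_components), make_model()
        )
        acc_plain = plain.fit(X_train, y_train).score(X_test, y_test)
        acc_reduced = reduced.fit(X_train, y_train).score(X_test, y_test)
        results[name] = (acc_plain, acc_reduced)
    return results


# Stand-in data only: NOT the kidney or heart datasets used in the paper.
X, y = load_breast_cancer(return_X_y=True)
results = compare_with_and_without_pca(X, y)
for name, (acc_plain, acc_reduced) in results.items():
    print(f"{name}: without PCA = {acc_plain:.3f}, with PCA = {acc_reduced:.3f}")
```

Fitting PCA inside the pipeline means the components are learned from the training split only, which avoids leaking test data into the reduction step. The accuracies printed here say nothing about the figures reported in the paper.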

Highlights

  • The kidneys and heart are vital organs in the human body and require extra care and attention to remain healthy

  • With Principal Component Analysis (PCA), the lowest accuracy is 97.5%; the other models score higher: KNN has 99% accuracy, Naïve Bayes 99%, support vector machine (SVM) 100% and random forest 100%

  • Naïve Bayes has the lowest accuracy, 84%, compared to the other models: logistic regression reaches 86%, and random forest, SVM and KNN each reach 85%


Summary

Introduction

The kidneys and heart are vital organs in the human body and require extra care and attention to remain healthy. In this era of modernization, in which people are exposed to polluted air, unhealthy lifestyles, packaged food high in trans fat, and more interaction with electronic gadgets than with family members, friends and relatives, the prevalence of chronic kidney disease (CKD) and cardiovascular disease is increasing tremendously. One study claimed that machine learning, the modern bedrock of artificial intelligence, could predict the occurrence of heart attack with an accuracy of more than 90 percent by identifying patterns that correlate variables with heart attack. Doctors use risk scores to make decisions during treatment. Results of one analysis reveal that a random forest algorithm with 12 attributes can detect CKD with an accuracy of 99.8% using the F1-measure and a root mean square error of 0.107.
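The evaluation measures named above (accuracy, F1-measure and root mean square error) are distinct quantities. The sketch below shows how each is usually computed for a binary classifier. The label vectors are made up for illustration and are unrelated to any study mentioned here, and the helper name `binary_metrics` is hypothetical.

```python
import math


def binary_metrics(y_true, y_pred):
    """Return (accuracy, F1-measure, RMSE) for 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    accuracy = sum(1 for t, p in pairs if t == p) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    # RMSE treats the 0/1 labels as numbers.
    rmse = math.sqrt(sum((t - p) ** 2 for t, p in pairs) / len(pairs))
    return accuracy, f1, rmse


# Made-up labels, for illustration only.
y_true = [1, 0, 1, 1, 0, 1]
y_pred = [1, 0, 0, 1, 1, 1]
accuracy, f1, rmse = binary_metrics(y_true, y_pred)
print(f"accuracy={accuracy:.3f}, F1={f1:.3f}, RMSE={rmse:.3f}")
```

For hard 0/1 predictions the squared error of each sample is 1 exactly when the prediction is wrong, so RMSE equals the square root of the error rate. It adds information beyond accuracy only when computed on predicted probabilities or scores.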
