Abstract

Malaysia's National Health and Morbidity Survey revealed that one-fifth of Malaysian adults have been diagnosed with diabetes. The disease occurs across age groups and often goes undetected, especially among young people, because testing requires special equipment and is available only at certain facilities. A tool that can generate highly accurate predictions is therefore essential. This research performed feature selection on a secondary dataset of seventeen attributes, containing no irrelevant data or missing values, and fed the result into three models: AdaBoost with a decision tree as the base algorithm, a Support Vector Machine (SVM), and an ensemble model. For each model, the five most influential features were selected using SelectKBest before training and testing, which yielded higher prediction accuracy. The predictions from the three models were compared, with the ensemble model combining the results of AdaBoost and SVM. A diabetes prediction prototype was developed to compare the accuracy of the three methods on the observed dataset. The research concludes that the ensemble model gives the highest accuracy for diabetes prediction and may be the most suitable method for diabetes prediction tools.
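
The workflow described in the abstract can be sketched with scikit-learn as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the data is a synthetic stand-in with seventeen attributes, the scoring function `f_classif`, the feature scaling step, and all hyperparameters are illustrative choices, and soft voting is assumed as the way the AdaBoost and SVM outputs are combined, since the abstract does not say how the ensemble merges them. The helper names `top5` and `build_models` are hypothetical.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier, VotingClassifier
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Assumption: synthetic stand-in for the study's dataset
# (seventeen attributes, binary diabetes label, no missing values).
X, y = make_classification(
    n_samples=500, n_features=17, n_informative=5, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0, stratify=y
)


def top5(model):
    """Wrap a classifier so it trains on the five best-scoring features only."""
    # Assumption: f_classif as the score function; scaling mainly helps the SVM.
    return make_pipeline(SelectKBest(f_classif, k=5), StandardScaler(), model)


def build_models():
    """Return the three models compared in the abstract (hypothetical settings)."""
    # Decision stump as the AdaBoost base learner, passed positionally.
    ada = top5(
        AdaBoostClassifier(
            DecisionTreeClassifier(max_depth=1), n_estimators=50, random_state=0
        )
    )
    # probability=True lets the SVM contribute class probabilities to soft voting.
    svm = top5(SVC(kernel="rbf", probability=True, random_state=0))
    # Assumption: soft voting averages the two models' predicted probabilities.
    ensemble = VotingClassifier(
        [("adaboost", ada), ("svm", svm)], voting="soft"
    )
    return {"AdaBoost": ada, "SVM": svm, "Ensemble": ensemble}


models = build_models()
accuracies = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    accuracies[name] = model.score(X_test, y_test)  # test-set accuracy
    print(f"{name}: {accuracies[name]:.3f}")
```

Placing `SelectKBest` inside each pipeline means the five features are chosen from the training split only, so the held-out accuracy is not influenced by the test rows. The accuracies printed here come from synthetic data and say nothing about the results reported in the study.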
