Abstract
Machine learning (ML) models are prone to the curse of dimensionality. A dataset with a larger number of features incurs a higher computational cost and can lower prediction accuracy. In this work, we therefore predict diabetes with higher accuracy while using fewer features. Feature subset selection is performed with two heuristic methods, Sequential Forward Selection (SFS) and Sequential Backward Selection (SBS), and two metaheuristic evolutionary methods, the Whale Optimization Algorithm (WOA) and the Genetic Algorithm (GA). The Gini index is also used as a filter evaluator. The resulting feature subsets are evaluated with three different types of ML models: Random Forest (RF), Multi-Layer Perceptron (MLP) and K-Nearest Neighbor (KNN). We predict type 2 diabetes with an accuracy of 96.82%. We also reduce the number of features by up to 67.44%, i.e., we retain only the 32.56% most relevant features.
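As an illustration of the wrapper-style selection named above, the following is a minimal sketch of Sequential Forward Selection. It is not the authors' implementation: the evaluator (leave-one-out accuracy of a 1-nearest-neighbor classifier), the synthetic dataset, and all function names (`knn_loo_accuracy`, `sequential_forward_selection`, `make_toy_data`) are assumptions chosen only to keep the example self-contained.

```python
import random


def knn_loo_accuracy(X, y, features):
    """Leave-one-out accuracy of a 1-nearest-neighbor classifier
    that only looks at the columns listed in `features`."""
    correct = 0
    for i in range(len(X)):
        best_j, best_d = None, float("inf")
        for j in range(len(X)):
            if i == j:
                continue
            # Squared Euclidean distance restricted to the candidate subset.
            d = sum((X[i][f] - X[j][f]) ** 2 for f in features)
            if d < best_d:
                best_d, best_j = d, j
        correct += y[best_j] == y[i]
    return correct / len(X)


def sequential_forward_selection(X, y, max_features):
    """Greedily add the feature that most improves the evaluator score;
    stop when no remaining feature improves it or the limit is reached."""
    selected = []
    remaining = list(range(len(X[0])))
    best_score = 0.0
    while remaining and len(selected) < max_features:
        scored = [(knn_loo_accuracy(X, y, selected + [f]), f) for f in remaining]
        score, feature = max(scored, key=lambda t: t[0])
        if score <= best_score:
            break  # no candidate improves on the current subset
        selected.append(feature)
        remaining.remove(feature)
        best_score = score
    return selected, best_score


def make_toy_data(n=60, seed=0):
    """Hypothetical synthetic data: feature 0 separates the two classes,
    features 1 to 4 are pure Gaussian noise."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = i % 2
        informative = 3.0 * label + rng.gauss(0, 0.3)
        noise = [rng.gauss(0, 1) for _ in range(4)]
        X.append([informative] + noise)
        y.append(label)
    return X, y


X, y = make_toy_data()
selected, score = sequential_forward_selection(X, y, max_features=3)
print("selected features:", selected, "LOO accuracy:", score)
```

Sequential Backward Selection follows the same pattern in reverse: it starts from the full feature set and greedily removes the feature whose removal hurts the score least. In practice the evaluator would be one of the models named in the abstract (RF, MLP or KNN) with cross-validation rather than this toy 1-nearest-neighbor scorer.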
More From: International Journal of Modeling, Simulation, and Scientific Computing