Abstract

Background: High-dimensional datasets suffer from the curse of dimensionality, which makes data mining more difficult. Feature selection, as a step in the knowledge discovery process, mitigates this problem and helps the classification task by reducing time complexity and improving accuracy.

Objectives: This paper aims to identify a bio-inspired algorithm that is well suited to feature selection and to apply optimized feature selection techniques. The algorithm is used to design machine learning classifiers suitable for multiple datasets, including high-dimensional ones, and to analyze performance in terms of classification accuracy and classification processing time.

Methods: This study employs an improved form of the grasshopper optimization algorithm for feature selection. An evolutionary outlay aware deep belief network performs the classification.

Findings: In this research, 20 UCI benchmark datasets are used, with the full 60 features and 30,000 instances. The datasets are Mammography, Monk-1, Bupa, Credit, Parkinson's, Monk-2, Sonar, Ecoli, Prognostic, Ionosphere, Monk-3, Yeast, Car, Blood, Pima, Spect, Vert, Prognostic, Contraceptive, and Tic-Tac-Toe endgame. Table 1 lists the datasets with their numbers of instances and features. The experiments are run in MATLAB 6.0 on Microsoft Windows 8, on a machine with a Core i3 processor, a 1 TB hard disk, and 8 GB of RAM. Performance is measured by classification accuracy and classification processing time.

Novelty: The Improved Grasshopper Optimization Algorithm (IGOA) uses the error rate and classification accuracy of the Evolutionary Outlay Aware Deep Belief Network Classifier (EOA-DBNC) as fitness function values. This combination of classification and feature selection is abbreviated as IGOA-EOA-DBNC.
Tested on the twenty datasets, the combined method gives better results in both elapsed time and accuracy.
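The idea of using a classifier's error rate as the fitness of a candidate feature subset (a "wrapper" approach) can be illustrated with a small sketch. This is not the paper's implementation: the nearest-centroid classifier below is only a lightweight stand-in for the deep belief network, the toy data generator is hypothetical, and names such as `fitness` and `nearest_centroid_error` are chosen here for illustration. An optimizer such as the grasshopper algorithm would search over the binary masks, which this sketch does not attempt.

```python
import random


def nearest_centroid_error(train, test, mask):
    """Error rate of a nearest-centroid classifier on the masked features.

    Stand-in classifier (assumption): the paper uses a deep belief network.
    """
    idx = [i for i, keep in enumerate(mask) if keep]
    if not idx:
        return 1.0  # no features selected: treat as the worst case

    # Accumulate per-class sums over the selected features only.
    sums, counts = {}, {}
    for x, y in train:
        acc = sums.setdefault(y, [0.0] * len(idx))
        for j, i in enumerate(idx):
            acc[j] += x[i]
        counts[y] = counts.get(y, 0) + 1
    centroids = {y: [s / counts[y] for s in acc] for y, acc in sums.items()}

    def sq_dist(x, c):
        return sum((x[i] - c[j]) ** 2 for j, i in enumerate(idx))

    wrong = 0
    for x, y in test:
        pred = min(centroids, key=lambda label: sq_dist(x, centroids[label]))
        wrong += pred != y
    return wrong / len(test)


def fitness(mask, train, test):
    """Wrapper fitness of a binary feature mask: held-out error (lower is better)."""
    return nearest_centroid_error(train, test, mask)


def make_toy_data(n, rng):
    """Hypothetical data: feature 0 is informative, features 1 and 2 are noise."""
    data = []
    for _ in range(n):
        y = rng.randint(0, 1)
        x = [4.0 * y + rng.gauss(0, 0.5), rng.gauss(0, 5.0), rng.gauss(0, 5.0)]
        data.append((x, y))
    return data


rng = random.Random(0)
train, test = make_toy_data(200, rng), make_toy_data(100, rng)

err_informative = fitness([1, 0, 0], train, test)  # keep only the useful feature
err_noise = fitness([0, 1, 1], train, test)        # keep only the noise features
err_empty = fitness([0, 0, 0], train, test)        # keep nothing
print(err_informative, err_noise, err_empty)
```

In this toy setting, the mask that keeps only the informative feature should score a lower error than the mask that keeps only noise, which is the signal a search procedure would exploit when ranking candidate subsets.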

