Abstract
This paper presents a two-stage feature selection scheme based on machine learning techniques. In the first stage, a wrapper method is used to select various combinations of feature subsets from the original dataset. The performance of the model is evaluated with three classifiers: K-Nearest Neighbor (KNN), Support Vector Machine (SVM), and Random Forest (RF). In the second and final stage, a sequential backward feature selection method is applied. The proposed method is demonstrated on eighteen datasets, where it achieves average classification accuracies of 89.81%, 87.55%, and 89.82% with the KNN, SVM, and RF classifiers, respectively, while the reduced feature subset contains at most ten features. Compared with eight other feature selection methods, the proposed method achieves better classification accuracy while selecting a smaller number of the most useful features.
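To make the second stage concrete, the following is a minimal sketch of sequential backward feature selection driven by a wrapper score. It is not the authors' implementation: the abstract does not specify the scoring procedure, so the leave-one-out k-NN accuracy used here, the function names `loo_knn_accuracy` and `sequential_backward_selection`, the `max_features` parameter, and the synthetic dataset are all assumptions made for illustration.

```python
import random


def loo_knn_accuracy(X, y, features, k=3):
    """Leave-one-out accuracy of a k-NN classifier restricted to `features`."""
    correct = 0
    for i, (xi, yi) in enumerate(zip(X, y)):
        # Squared Euclidean distance to every other sample, on the chosen features only.
        dists = sorted(
            (sum((xi[f] - xj[f]) ** 2 for f in features), y[j])
            for j, xj in enumerate(X)
            if j != i
        )
        votes = [label for _, label in dists[:k]]
        predicted = max(set(votes), key=votes.count)
        correct += predicted == yi
    return correct / len(X)


def sequential_backward_selection(X, y, max_features, score=loo_knn_accuracy):
    """Greedily drop one feature at a time (always keeping at least one).

    Removal continues until at most `max_features` remain, and then for as
    long as dropping a further feature does not lower the wrapper score.
    """
    selected = list(range(len(X[0])))
    best_score = score(X, y, selected)
    while len(selected) > 1:
        # Score every subset obtained by removing a single feature.
        candidates = [
            (score(X, y, [f for f in selected if f != drop]), drop)
            for drop in selected
        ]
        cand_score, drop = max(candidates, key=lambda c: c[0])
        # Stop once within budget if any further removal would hurt accuracy.
        if len(selected) <= max_features and cand_score < best_score:
            break
        selected.remove(drop)
        best_score = cand_score
    return selected, best_score


# Hypothetical toy data: features 0 and 1 carry the class signal, 2..5 are noise.
random.seed(0)
X, y = [], []
for label in (0, 1):
    for _ in range(20):
        informative = [random.gauss(3.0 * label, 0.5) for _ in range(2)]
        noise = [random.uniform(-5.0, 5.0) for _ in range(4)]
        X.append(informative + noise)
        y.append(label)

selected, accuracy = sequential_backward_selection(X, y, max_features=3)
print("kept feature indices:", selected, "LOO accuracy:", round(accuracy, 3))
```

The `score` argument is deliberately pluggable: the same loop could be scored with an SVM or random forest instead of k-NN, which mirrors the abstract's use of three classifiers, though how the paper combines them is not stated here.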
More From: International Journal of Data Warehousing and Mining