Abstract

Group feature selection methods reduce model complexity by selecting the important feature groups and removing the irrelevant ones. Few of these methods use ranking techniques, even though ranking gives the relative importance of each group. We therefore propose a sparse group feature ranking method for high-dimensional data based on dimension reduction. First, we apply Relief within each group to remove irrelevant features. Second, we extract a single new feature to represent each group by applying Fisher linear discriminant analysis (FDA) to that group, which reduces it to one dimension. Third, we estimate the relative importance of the new features with a random forest and select those with higher importance than the rest. Finally, we train and test models using several machine-learning algorithms. In experiments on real-life high-dimensional datasets, we compare the proposed method with the supervised group lasso (SGL). The results show that the proposed method selects a small number of important feature groups, as existing group feature selection methods do, while also providing the ranking and relative importance of all groups. In terms of classification performance metrics, SGL performs slightly better with logistic regression, whereas the proposed method performs better with support vector machines, random forests, and gradient boosting.
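The pipeline described above (Relief within each group, FDA to one dimension per group, random forest importance across groups) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a binary class label, a basic Relief variant written by hand, scikit-learn's `LinearDiscriminantAnalysis` as a stand-in for FDA, and a hypothetical `keep` fraction for the Relief filter. The function names, the group layout, and the synthetic data are all illustrative assumptions.

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.ensemble import RandomForestClassifier


def relief_weights(X, y, n_iter=100, seed=0):
    """Basic binary Relief (assumed variant): a feature gains weight when it
    differs on the nearest miss and loses weight when it differs on the nearest hit."""
    rng = np.random.default_rng(seed)
    span = X.max(axis=0) - X.min(axis=0)
    span[span == 0] = 1.0                      # avoid division by zero
    Xs = (X - X.min(axis=0)) / span            # scale features to [0, 1]
    w = np.zeros(X.shape[1])
    for i in rng.integers(0, len(Xs), size=n_iter):
        dist = np.abs(Xs - Xs[i]).sum(axis=1)  # Manhattan distance to sample i
        dist[i] = np.inf                       # exclude the sample itself
        same = y == y[i]
        hit = np.argmin(np.where(same, dist, np.inf))
        miss = np.argmin(np.where(~same, dist, np.inf))
        w += np.abs(Xs[i] - Xs[miss]) - np.abs(Xs[i] - Xs[hit])
    return w / n_iter


def rank_groups(X, y, groups, keep=0.5, seed=0):
    """groups maps a group name to its column indices (hypothetical layout).
    Returns (name, importance) pairs sorted from most to least important."""
    names, reduced = [], []
    for name, cols in groups.items():
        cols = np.asarray(cols)
        # Step 1: Relief inside the group; keep the top `keep` fraction.
        w = relief_weights(X[:, cols], y, seed=seed)
        k = max(1, int(np.ceil(keep * len(cols))))
        kept = cols[np.argsort(w)[::-1][:k]]
        # Step 2: project the surviving features onto one discriminant axis.
        lda = LinearDiscriminantAnalysis(n_components=1)
        reduced.append(lda.fit_transform(X[:, kept], y)[:, 0])
        names.append(name)
    Z = np.column_stack(reduced)               # one column per group
    # Step 3: random forest importance over the per-group features.
    rf = RandomForestClassifier(n_estimators=200, random_state=seed).fit(Z, y)
    imp = rf.feature_importances_
    return [(names[i], float(imp[i])) for i in np.argsort(imp)[::-1]]


# Synthetic illustration: only group "g0" carries class signal.
rng = np.random.default_rng(0)
y = rng.integers(0, 2, size=200)
X = rng.normal(size=(200, 12))
X[:, 0:2] += 2.0 * y[:, None]
groups = {"g0": [0, 1, 2, 3], "g1": [4, 5, 6, 7], "g2": [8, 9, 10, 11]}
ranking = rank_groups(X, y, groups)
print(ranking)
```

In this sketch the projection and the forest are fitted on the same samples, which is enough to show the ranking step but would inflate importances in a real evaluation. A held-out split or cross-validation would be the safer choice there.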
