Abstract

This paper proposes a novel approach to supervised feature selection based on dense subgraph discovery. The dataset is first mapped to an equivalent weighted graph whose vertex set is the set of all features and whose edge weights are the mutual dependency between each pair of features. The proposed algorithm then proceeds in two phases. In the first phase, a dense subgraph is discovered such that the features within it are maximally non-redundant with one another while their average class relevance and average standard deviation are as large as possible. To this end, a novel induced degree is defined for each feature that incorporates these three objectives, and a modified version of an existing approximation algorithm is used to find the dense subgraph. In the second phase, a floating forward–backward search is performed on the resulting dense subgraph to reveal a better feature subset. In both phases, an existing version of the normalized mutual information score is used to compute both class relevance and redundancy. The main contribution of this paper is a feature selection strategy whose reduced feature set has maximal average class relevance, minimal average pairwise redundancy, and good discriminating power. Experimental results demonstrate that the proposed approach is competitive with several conventional as well as state-of-the-art supervised feature selection algorithms.
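
The first phase can be illustrated with a minimal sketch. The abstract does not give the exact induced degree, the normalization used for mutual information, or the modified approximation algorithm, so everything below is an assumption made for illustration: `nmi` uses a geometric-mean normalization, `induced_degree` is a hypothetical score (class relevance plus scaled standard deviation minus average redundancy), and `peel_dense_subset` is a simple greedy peeling loop that stops at a fixed size `k`. The second phase (floating forward–backward search) is not shown.

```python
import math
from collections import Counter
from statistics import pstdev


def entropy(xs):
    """Shannon entropy (natural log) of a discrete sequence."""
    n = len(xs)
    return -sum((c / n) * math.log(c / n) for c in Counter(xs).values())


def nmi(xs, ys):
    """Normalized mutual information of two discrete sequences.

    Assumption: normalized by sqrt(H(X) * H(Y)); the paper may use another variant.
    """
    hx, hy = entropy(xs), entropy(ys)
    if hx == 0.0 or hy == 0.0:
        return 0.0
    mi = hx + hy - entropy(list(zip(xs, ys)))
    return max(0.0, mi) / math.sqrt(hx * hy)


def peel_dense_subset(features, labels, k):
    """Greedy peeling: drop the feature with the lowest (hypothetical)
    induced degree until only k features remain."""
    if k < 1:
        raise ValueError("k must be at least 1")
    alive = sorted(features)
    # Vertex attributes: class relevance and standard deviation scaled to [0, 1].
    rel = {f: nmi(features[f], labels) for f in alive}
    sd = {f: pstdev(features[f]) for f in alive}
    max_sd = max(sd.values()) or 1.0
    # Edge weights of the feature graph: pairwise redundancy.
    red = {(a, b): nmi(features[a], features[b])
           for a in alive for b in alive if a != b}

    def induced_degree(f):
        # Hypothetical score, not the paper's definition.
        others = [g for g in alive if g != f]
        avg_red = sum(red[(f, g)] for g in others) / len(others)
        return rel[f] + sd[f] / max_sd - avg_red

    while len(alive) > k:
        alive.remove(min(alive, key=induced_degree))
    return alive


# Toy data: four classes encoded by two binary features, plus a duplicate and noise.
labels = [0, 0, 1, 1, 2, 2, 3, 3]
features = {
    "high_bit":      [0, 0, 0, 0, 1, 1, 1, 1],
    "high_bit_copy": [0, 0, 0, 0, 1, 1, 1, 1],  # fully redundant with high_bit
    "low_bit":       [0, 0, 1, 1, 0, 0, 1, 1],
    "noise":         [0, 1, 0, 1, 0, 1, 0, 1],  # independent of the labels
}
selected = peel_dense_subset(features, labels, k=2)
print(selected)
```

On this toy data the sketch keeps `low_bit` together with one of the two duplicated features and discards `noise`, which is the behaviour the three objectives are meant to produce. Real-valued features would need to be discretized before `nmi` is applied.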
