Abstract

With the existing abundance of intelligent and expert systems, there is a need for selecting a subset of highly relevant features with low redundancy. In filter approaches, feature subsets are computed iteratively by evaluating each candidate feature in terms of its relevance to the target class and its pairwise redundancy with other features. The use of mutual information-based metrics has been extensively studied as an approach to quantifying the relevance and redundancy of candidate features. In this study, a novel filter approach based on the ranks of positive instances is proposed. In this approach, redundancy is replaced by diversity, which quantifies the complementarity of a candidate feature with respect to the already selected subset. Both relevance and diversity are computed in terms of the ranks of positive instances, which is analogous to the computation of the area under the receiver operating characteristic curve (AUC). Experiments conducted on 15 UCI and microarray gene expression data sets confirm that the proposed multivariate filter feature selection approach achieves better performance scores than competing multivariate methods as well as benchmark univariate filters.
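To illustrate the analogy with AUC, the following minimal sketch computes the AUC of a single feature purely from the ranks of the positive instances, using the standard Mann-Whitney identity. It is not the relevance or diversity measure proposed in the study, whose exact definitions the abstract does not give; the function names `average_ranks` and `auc_from_positive_ranks` are hypothetical and chosen only for this example.

```python
from typing import Sequence


def average_ranks(values: Sequence[float]) -> list[float]:
    """Return 1-based ascending ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        # Extend `end` over the run of values tied with values[order[start]].
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1  # mean of 1-based positions start+1 .. end+1
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def auc_from_positive_ranks(feature: Sequence[float], labels: Sequence[int]) -> float:
    """AUC of one feature from the rank sum of its positive instances.

    Assumes binary labels with 1 marking the positive class (an assumption
    of this sketch, not something stated in the abstract).
    """
    ranks = average_ranks(feature)
    n_pos = sum(1 for y in labels if y == 1)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one positive and one negative instance")
    pos_rank_sum = sum(r for r, y in zip(ranks, labels) if y == 1)
    # Mann-Whitney identity: subtract the smallest possible positive rank sum,
    # then normalize by the number of positive-negative pairs.
    return (pos_rank_sum - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


if __name__ == "__main__":
    feature = [0.1, 0.4, 0.35, 0.8]
    labels = [0, 0, 1, 1]
    print(auc_from_positive_ranks(feature, labels))
```

A feature that ranks every positive instance above every negative one scores 1.0, and a feature with no ordering information scores 0.5, so a univariate filter could rank features by this value alone.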
