Abstract
This study proposes a supervised feature selection technique for classification in high-dimensional binary-class problems by adding robustness to the conventional Fisher Score. The proposed method uses robust measures of location and dispersion: the median and the Rousseeuw and Croux statistic (Qn), respectively. Initially, a minimum subset of genes is identified by a greedy search approach; this subset is then combined with the top-ranked genes obtained via the proposed Robust Fisher Score (RFish). To remove redundancy in the selected genes, the Least Absolute Shrinkage and Selection Operator (LASSO) is then applied. The proposed method is validated on five publicly available datasets and is further assessed in a detailed simulation study. Its results are compared with those of six well-known feature selection methods, based on prediction performance with Random Forest (RF), Support Vector Machine (SVM) and k-Nearest Neighbour (k-NN) classifiers. The findings are presented in boxplots and barplots, which show that the proposed method (RFish) outperforms all the other methods in the majority of cases.
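The ranking step described above can be illustrated with a minimal sketch. The abstract does not state the exact RFish formula, so the form used here is an assumption: the conventional two-class Fisher Score with each class mean replaced by the class median and each class standard deviation replaced by Qn. The function names (`qn_scale`, `robust_fisher_score`, `rank_features`) are hypothetical, the factor 2.2219 is the usual normal-consistency constant for Qn with no finite-sample correction, and the greedy search and LASSO steps are not shown.

```python
from itertools import combinations
from math import comb
from statistics import median


def qn_scale(values):
    """Rousseeuw-Croux Qn: scaled k-th smallest pairwise absolute difference."""
    n = len(values)
    if n < 2:
        raise ValueError("Qn needs at least two observations")
    h = n // 2 + 1
    k = comb(h, 2)  # 1-based rank of the order statistic
    diffs = sorted(abs(a - b) for a, b in combinations(values, 2))
    return 2.2219 * diffs[k - 1]  # naive O(n^2) version, fine for a sketch


def robust_fisher_score(class_a, class_b):
    """Assumed form: squared median gap over the sum of squared Qn scales."""
    gap = median(class_a) - median(class_b)
    spread = qn_scale(class_a) ** 2 + qn_scale(class_b) ** 2
    if spread == 0:
        return 0.0 if gap == 0 else float("inf")
    return gap ** 2 / spread


def rank_features(rows, labels):
    """Score every column of `rows` (labels are 0 or 1); best feature first."""
    n_features = len(rows[0])
    scores = []
    for j in range(n_features):
        a = [r[j] for r, lab in zip(rows, labels) if lab == 0]
        b = [r[j] for r, lab in zip(rows, labels) if lab == 1]
        scores.append(robust_fisher_score(a, b))
    order = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return order, scores


# Toy data: feature 0 separates the classes, feature 1 does not
# (it only contains one outlier in class 0).
rows = [[1, 5], [2, 6], [3, 7], [4, 8], [5, 100],
        [11, 5], [12, 6], [13, 7], [14, 8], [15, 9]]
labels = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
order, scores = rank_features(rows, labels)
```

In this toy example the single outlier in feature 1 shifts neither the class median nor Qn, so that feature receives a score of zero and is ranked below feature 0.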