Abstract

Feature selection has been studied widely in the literature, but the efficacy of the selection criteria for low sample size applications is neglected in most cases. Most existing feature selection criteria are based on sample similarity; however, distance measures lose significance for high dimensional low sample size (HDLSS) data. Moreover, the variance of a feature estimated from only a few samples is unreliable unless those samples represent the data distribution well. Instead of looking at the samples in groups, we evaluate features in a pairwise fashion. In our investigation, we observed that considering one pair of samples at a time and selecting the features that bring that pair closer, or push it farther apart, is a better choice for feature selection. Experimental results on benchmark data sets demonstrate the effectiveness of the proposed method for low sample sizes, where it outperforms many other state-of-the-art feature selection methods.

Highlights

  • In this age of information, high-dimensional data with low sample sizes are very common in various areas of science [1]

  • We propose a naive yet effective way of selecting features: combinational feature selection followed by a heuristic score assignment to each feature

  • The proposed pair-wise feature proximity (PWFP) based feature selection method is compared extensively with other methods from the literature


Summary

INTRODUCTION

In this age of information, high-dimensional data with low sample sizes are very common in various areas of science [1]. Feature selection is the process of selecting an optimal subset of features from the input feature set based on a selection criterion [3]. It reduces the data dimensionality by removing redundant features and improves the time and space complexity of subsequent processing. Different criterion functions have been proposed in the literature to evaluate the goodness of features, such as mutual information (MutInf) [5], Fisher score (FS) [6], feature selection via concave minimization (FSCM) [7], ReliefF [8], Laplacian score (LS) [9], trace ratio criterion (TRC) [10], spectral feature selection (SPEC) [11], and infinite feature selection (IFS) [12]. These criteria have demonstrated excellent performance in real-world applications. The proposed pair-wise feature proximity (PWFP) based feature selection method is compared with these methods and evaluated extensively.
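
As a rough illustration of the pairwise idea described in the abstract, the sketch below scores features one sample pair at a time: features that make a same-class pair look similar, or a different-class pair look dissimilar, receive a vote. This is only a minimal sketch under stated assumptions (class labels are available, absolute feature-wise differences measure proximity, and a top-k voting rule with a hypothetical parameter k aggregates the pairwise rankings); it is not the exact PWFP formulation from the paper.

    import numpy as np
    from itertools import combinations

    def pairwise_feature_scores(X, y, k=10):
        # X: (n_samples, n_features) data matrix; y: class labels.
        # For each pair of samples, rank features by |x_i - x_j|:
        # same-class pairs vote for their k most similar features,
        # different-class pairs vote for their k most dissimilar features.
        # The parameter k is a hypothetical choice, not taken from the paper.
        n_samples, n_features = X.shape
        scores = np.zeros(n_features)
        for i, j in combinations(range(n_samples), 2):
            diff = np.abs(X[i] - X[j])
            if y[i] == y[j]:
                top = np.argsort(diff)[:k]          # features bringing the pair closer
            else:
                top = np.argsort(diff)[::-1][:k]    # features pushing the pair apart
            scores[top] += 1
        return scores

    # Keep the m leading (highest-scoring) features:
    # selected = np.argsort(pairwise_feature_scores(X, y))[::-1][:m]

Voting across all pairs is one simple way to realize the score assignment to each feature mentioned in the highlights; the paper's actual criterion may aggregate the pairwise proximities differently.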

Notation
PROPOSED METHOD
EXPERIMENTS AND RESULTS
Hyperspectral data sets
Face Recognition
Other data sets
CONCLUSION