Abstract

We present a new method based on the ROC (Receiver Operating Characteristic) curve for efficiently selecting a feature subset when classifying a high-dimensional microarray dataset with a limited number of observations. Our method has two steps: (1) selecting the features most relevant to the target label using the ROC curve and (2) iteratively eliminating redundant features using ROC curves. The ROC curve is closely related to non-parametric hypothesis testing, which should be effective for datasets with a small number of observations. Experiments with real datasets showed a significant performance advantage of our method over two competing feature subset selection methods.
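As a rough illustration of step (1), the sketch below scores each feature by the area under its ROC curve (AUC) and keeps the top-ranked ones. It relies on the standard fact that the AUC equals the normalized Mann-Whitney U statistic, which is the link to non-parametric testing. The function names (`auc`, `relevance`, `select_relevant`), the direction-agnostic score `max(AUC, 1 - AUC)`, the cutoff `k`, and the toy data are all assumptions made for illustration, not details taken from the method itself. Step (2) is not sketched, since the redundancy criterion is not specified here.

```python
from itertools import product


def auc(scores, labels):
    """AUC of one feature used directly as a classifier score.

    Computed as the fraction of (positive, negative) sample pairs in which
    the positive sample has the larger value; ties count as one half.
    This equals the Mann-Whitney U statistic divided by n_pos * n_neg.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p, n in product(pos, neg)
    )
    return wins / (len(pos) * len(neg))


def relevance(scores, labels):
    """Assumed relevance score: a feature with AUC 0.1 separates the
    classes as well as one with AUC 0.9, so the direction is ignored."""
    a = auc(scores, labels)
    return max(a, 1.0 - a)


def select_relevant(features, labels, k):
    """Return the indices of the k features with the highest relevance.

    `features` is a list of columns, one list of sample values per feature.
    Ties keep their original order because Python's sort is stable.
    """
    ranked = sorted(
        range(len(features)),
        key=lambda j: relevance(features[j], labels),
        reverse=True,
    )
    return ranked[:k]


# Toy data (hypothetical): 3 features measured on 6 samples, binary labels.
features = [
    [1, 2, 3, 4, 5, 6],  # perfectly separates the classes
    [3, 1, 4, 1, 5, 9],  # weakly informative
    [6, 5, 4, 3, 2, 1],  # perfectly separates, in the reverse direction
]
labels = [0, 0, 0, 1, 1, 1]

print(select_relevant(features, labels, k=2))
```

The pairwise computation is quadratic in the number of samples, which is acceptable in the small-sample setting described above; a rank-based formula would be preferable for larger datasets.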
