Abstract

Feature selection and ranking is of great importance in the analysis of biomedical data. In addition to reducing the number of features used in classification or other machine learning tasks, it allows us to extract meaningful biological and medical information from a machine learning model. Most existing approaches in this domain do not directly model the fact that the relative importance of features can be different in different regions of the feature space. In this work, we present a context aware feature ranking algorithm called CAFÉ-Map. CAFÉ-Map is a locally linear feature ranking framework that allows recognition of important features in any given region of the feature space or for any individual example. This allows for simultaneous classification and feature ranking in an interpretable manner. We have benchmarked CAFÉ-Map on a number of toy and real world biomedical data sets. Our comparative study with a number of published methods shows that CAFÉ-Map achieves better accuracies on these data sets. The top ranking features obtained through CAFÉ-Map in a gene profiling study correlate very well with the importance of different genes reported in the literature. Furthermore, CAFÉ-Map provides a more in-depth analysis of feature ranking at the level of individual examples.Availability: CAFÉ-Map Python code is available at:http://faculty.pieas.edu.pk/fayyaz/software.html#cafemap .The CAFÉ-Map package supports parallelization and sparse data and provides example scripts for classification. This code can be used to reconstruct the results given in this paper.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call