Abstract
Feature selection is an important method for improving the efficiency and accuracy of text categorization algorithms by removing redundant and irrelevant terms from the corpus. Extensive research has been done to improve the performance of individual feature selection methods, but much less on their combinations. In this paper, we propose a method for combining multiple feature selection methods using combinatorial fusion analysis (CFA). A rank-score function and its graph, called the rank-score graph, are adopted to measure the diversity of different feature selection methods. We show that a combination of multiple feature selection methods can outperform a single method only if each individual feature selection method has unique scoring behavior and relatively high performance. Moreover, we show that the rank-score function and the rank-score graph are useful for selecting which feature selection methods to combine.
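The ideas named in the abstract can be illustrated with a minimal sketch. It assumes that each feature selection method assigns a numeric score to every term, that the rank-score function maps rank i to the normalized score of the term at that rank, and that diversity is measured as the root-mean-square gap between two rank-score functions. The abstract does not specify these formulas, so the min-max normalization, the diversity measure, the averaging rules, and all names and toy scores below are illustrative assumptions rather than the paper's exact method.

```python
from math import sqrt

def normalize(scores):
    """Min-max normalize a {term: score} mapping into [0, 1] (assumed scheme)."""
    lo, hi = min(scores.values()), max(scores.values())
    span = hi - lo
    return {t: (s - lo) / span if span else 0.0 for t, s in scores.items()}

def rank_score_function(scores):
    """Normalized scores in descending order: entry i-1 is the score at rank i."""
    return sorted(normalize(scores).values(), reverse=True)

def diversity(scores_a, scores_b):
    """Root-mean-square gap between two rank-score functions (assumed measure)."""
    fa, fb = rank_score_function(scores_a), rank_score_function(scores_b)
    return sqrt(sum((x - y) ** 2 for x, y in zip(fa, fb)) / len(fa))

def ranks(scores):
    """Map each term to its rank, with 1 for the highest score."""
    ordered = sorted(scores, key=scores.get, reverse=True)
    return {t: i + 1 for i, t in enumerate(ordered)}

def score_combination(*systems):
    """Average the normalized scores of each term across methods."""
    normed = [normalize(s) for s in systems]
    return {t: sum(n[t] for n in normed) / len(normed) for t in normed[0]}

def rank_combination(*systems):
    """Average the rank of each term across methods (lower is better)."""
    rs = [ranks(s) for s in systems]
    return {t: sum(r[t] for r in rs) / len(rs) for t in rs[0]}

# Hypothetical scores from two feature selection methods over the same terms.
method_a = {"ball": 4.0, "game": 3.0, "the": 1.0, "of": 0.0}
method_b = {"ball": 0.9, "game": 0.1, "the": 0.2, "of": 0.0}

gap = diversity(method_a, method_b)            # larger gap: more diverse scoring behavior
fused = rank_combination(method_a, method_b)   # candidate fused ranking of terms
```

Plotting `rank_score_function(method_a)` and `rank_score_function(method_b)` against rank on the same axes gives the kind of rank-score graph the abstract refers to: curves that lie far apart indicate methods with different scoring behavior, which the abstract identifies as one condition for a combination to outperform its components.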