Abstract

Feature selection has long been recognized as an important preprocessing technique to reduce dimensionality and improve the performance of regression and classification tasks. The class of sequential forward feature selection methods based on Mutual Information (MI) is widely used in practice, mainly due to its computational efficiency and independence from the specific classifier. A recent work introduced a theoretical framework for this class of methods that explains the existing proposals as approximations to an optimal target objective function. This framework made clear the advantages and drawbacks of each proposal. Methods that account for the redundancy of candidate features using a maximization function and that consider the so-called complementary effect are among the best ones. However, they still penalize complementarity, which is an important drawback. This paper proposes the Decomposed Mutual Information Maximization (DMIM) method, which keeps the good theoretical properties of the best methods proposed so far but overcomes the complementarity penalization by applying the maximization separately to the inter-feature and class-relevant redundancies. DMIM was extensively evaluated and compared with other methods, both theoretically and empirically, using two synthetic scenarios and 20 publicly available real datasets with specific classifiers. Our results show that DMIM achieves better classification performance than the remaining forward feature selection methods based on MI.
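To make the class of methods discussed above concrete, the sketch below shows a generic MI-based sequential forward selection loop. It is an illustration only: the relevance-minus-maximum-redundancy score used here is a common placeholder criterion from this family, not the DMIM objective defined in the paper, and the function names, binning choice, and parameters are assumptions introduced for the example.

```python
# Minimal sketch of MI-based sequential forward feature selection (not the
# authors' DMIM implementation; the scoring rule is a generic placeholder).
import numpy as np
from sklearn.feature_selection import mutual_info_classif
from sklearn.metrics import mutual_info_score


def discretize(col, bins=10):
    """Bin a continuous column so mutual_info_score can treat it as discrete."""
    edges = np.histogram_bin_edges(col, bins=bins)
    return np.digitize(col, edges[1:-1])


def mi_forward_selection(X, y, n_selected):
    """Greedily pick features by relevance minus maximum inter-feature redundancy."""
    relevance = mutual_info_classif(X, y)            # I(X_k; Y) for each candidate
    binned = [discretize(X[:, j]) for j in range(X.shape[1])]
    selected, remaining = [], list(range(X.shape[1]))
    while len(selected) < n_selected and remaining:
        def score(f):
            if not selected:
                return relevance[f]                   # first pick: most relevant feature
            # Maximum redundancy with already-selected features; a stand-in
            # for the paper's own objective function.
            redundancy = max(mutual_info_score(binned[f], binned[s]) for s in selected)
            return relevance[f] - redundancy
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected
```

With a feature matrix `X` (samples by features) and class labels `y`, calling `mi_forward_selection(X, y, 5)` returns the indices of five greedily selected features, which can then be passed to any downstream classifier.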
