Abstract

Feature interaction is a newly proposed type of feature relevance relationship, and the unintentional removal of interactive features can result in poor classification performance. Traditional feature selection algorithms, however, mainly focus on detecting relevant and redundant features, while interactive features are usually ignored. To deal with this problem, feature relevance, feature redundancy, and feature interaction are first redefined based on information theory. A new feature selection algorithm named CMIFSI (Conditional Mutual Information based Feature Selection considering Interaction) is then proposed, which uses conditional mutual information to estimate feature redundancy and feature interaction, respectively. To verify its effectiveness, empirical experiments compare CMIFSI with several other representative feature selection algorithms. The results on both synthetic and benchmark datasets indicate that CMIFSI outperforms the other methods in most cases, and they highlight the necessity of dealing with feature interaction.
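
The excerpt does not spell out CMIFSI's scoring formula, but the two quantities it builds on are easy to illustrate. Below is a minimal NumPy sketch (variable names and the synthetic data are illustrative, not from the paper) that estimates mutual information and conditional mutual information for discrete features, then shows how conditioning exposes interaction (an XOR pair) and redundancy (a noisy copy):

```python
import numpy as np

def joint_entropy(*vs):
    """Shannon entropy (bits) of the empirical joint distribution
    of one or more discrete variables."""
    stacked = np.column_stack(vs)
    _, counts = np.unique(stacked, axis=0, return_counts=True)
    p = counts / counts.sum()
    return -np.sum(p * np.log2(p))

def mutual_info(x, y):
    """I(X;Y) = H(X) + H(Y) - H(X,Y)."""
    return joint_entropy(x) + joint_entropy(y) - joint_entropy(x, y)

def cond_mutual_info(x, y, z):
    """I(X;Y|Z) = H(X,Z) + H(Y,Z) - H(X,Y,Z) - H(Z)."""
    return (joint_entropy(x, z) + joint_entropy(y, z)
            - joint_entropy(x, y, z) - joint_entropy(z))

rng = np.random.default_rng(0)
n = 20000
x0 = rng.integers(0, 2, n)
x1 = rng.integers(0, 2, n)
y = x0 ^ x1                                     # class depends on x0 and x1 jointly
x2 = np.where(rng.random(n) < 0.9, y, 1 - y)    # noisy copy of y: relevant
x3 = np.where(rng.random(n) < 0.9, x2, 1 - x2)  # noisy copy of x2: redundant

# Interaction: x0 alone carries almost no information about y, but
# conditioning on its partner x1 reveals ~1 bit: I(x0;y|x1) >> I(x0;y).
print(mutual_info(x0, y), cond_mutual_info(x0, y, x1))   # ~0.0, ~1.0

# Redundancy: x3 is individually relevant, but conditioning on x2
# removes nearly all of that relevance: I(x3;y|x2) << I(x3;y).
print(mutual_info(x3, y), cond_mutual_info(x3, y, x2))   # ~0.32, ~0.0
```

A greedy forward selector could combine these quantities, e.g. scoring each candidate f by I(f;y) plus the sum over already-selected features s of [I(f;y|s) - I(f;y)], which rewards interaction and penalizes redundancy; whether CMIFSI uses this exact form is not stated in the excerpt.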

Highlights

  • In an era of growing data complexity and volume, high-dimensional data poses a major challenge for data processing, as it increases computational complexity in computer engineering

  • Genetic Algorithm (GA), Symmetric Uncertainty (SU), Relief, Minimum-Redundancy Maximum-Relevance (MRMR), Conditional Mutual Information Maximization (CMIM), and CMIFSI are feature ranking methods, while Correlation-based Feature Selection (CFS) is a subset selection method

  • Compared with the other feature selection algorithms, CMIFSI achieves the best performance in most cases


Summary

Introduction

In an era of growing data complexity and volume, high-dimensional data poses a major challenge for data processing, as it increases computational complexity in computer engineering. Excessive features raise computational cost and can cause a learning algorithm to over-fit the training data. Feature selection methods are commonly grouped into three families. Wrapper methods use a predetermined classifier to evaluate candidate feature subsets; they usually achieve higher predictive accuracy than other methods, but carry a heavy computational burden and a high risk of being overly specific to the chosen classifier, much like heuristic search algorithms that depend excessively on hyper-parameters. Embedded methods integrate feature selection into the training process of a given learning algorithm; they are less computationally expensive, but require strict assumptions about the model structure. Filter methods are independent of learning algorithms because they evaluate features using intrinsic properties of the data, such as statistical or information-theoretic measures, rather than the performance of a specific classifier.
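
To make this taxonomy concrete, here is a minimal sketch (assuming scikit-learn; the dataset and classifier are illustrative, not from the paper) contrasting a filter ranking, which scores features from the data alone, with a wrapper evaluation, which retrains a fixed classifier for each candidate subset:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import mutual_info_classif
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

X, y = make_classification(n_samples=500, n_features=10,
                           n_informative=3, random_state=0)

# Filter: rank features by mutual information with the class label.
# No classifier is involved, so the ranking is cheap and classifier-agnostic.
mi = mutual_info_classif(X, y, random_state=0)
filter_ranking = np.argsort(mi)[::-1]
print("filter ranking:", filter_ranking)

# Wrapper: evaluate a candidate subset by the cross-validated accuracy of a
# fixed classifier. Tailored to that classifier, but each candidate subset
# costs a full cross-validation run.
subset = filter_ranking[:3]
score = cross_val_score(KNeighborsClassifier(), X[:, subset], y, cv=5).mean()
print("wrapper CV accuracy on top-3 subset:", round(score, 3))
```

An embedded method would instead fold selection into training itself, for example an L1-regularized linear model that drives uninformative coefficients to zero.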
