Abstract

• Reading grid of six properties for regression filter criteria analysis.
• Analysis of a nonlinear coefficient of determination criterion for feature selection.
• Comparison of three relevance criteria for regression problems.

Feature selection is an important preprocessing step in machine learning. It helps to understand the importance of individual features and reduces the dimensionality of a dataset, which in turn improves learning and information extraction. Among the existing feature selection methods, filters are popular because they are computationally efficient and independent of the model that will be learnt afterwards. The effectiveness of a filter method depends on one strategic choice: the relevance criterion. Many criteria exist; they exhibit different properties, which in turn lead to different features being selected. The choice of criterion is thus important and should ideally be linked to the properties of the data and to the user's goals. This paper shows that six properties should be analysed when selecting a relevance criterion for regression problems. It proposes a reading grid for analysing relevance criteria and making a well-guided choice.
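The claim that different relevance criteria select different features can be illustrated with a minimal sketch. It is not the paper's method: the two criteria (squared Pearson correlation and a binned correlation ratio), the synthetic data, and all function names below are assumptions chosen for illustration only.

```python
import numpy as np

def squared_pearson(x, y):
    """Linear relevance criterion: squared Pearson correlation."""
    r = np.corrcoef(x, y)[0, 1]
    return r ** 2

def correlation_ratio(x, y, n_bins=10):
    """Nonlinear relevance criterion (illustrative): share of the variance
    of y explained by the mean of y within equal-frequency bins of x."""
    edges = np.quantile(x, np.linspace(0, 1, n_bins + 1))
    bins = np.clip(np.searchsorted(edges, x, side="right") - 1, 0, n_bins - 1)
    total = np.sum((y - y.mean()) ** 2)
    between = 0.0
    for b in range(n_bins):
        yb = y[bins == b]
        if yb.size:
            between += yb.size * (yb.mean() - y.mean()) ** 2
    return between / total

def rank_features(X, y, criterion):
    """Filter step: score each feature independently of any model,
    then sort feature indices by decreasing relevance."""
    scores = np.array([criterion(X[:, j], y) for j in range(X.shape[1])])
    order = np.argsort(-scores)
    return order, scores

# Synthetic regression problem (assumed for this sketch):
# feature 0 acts quadratically, feature 1 linearly, feature 2 is irrelevant.
rng = np.random.default_rng(0)
n = 2000
X = rng.uniform(-1, 1, size=(n, 3))
y = X[:, 0] ** 2 + 0.3 * X[:, 1] + rng.normal(0, 0.1, n)

pearson_order, pearson_scores = rank_features(X, y, squared_pearson)
ratio_order, ratio_scores = rank_features(X, y, correlation_ratio)

print("squared Pearson ranking:  ", pearson_order, pearson_scores.round(3))
print("correlation ratio ranking:", ratio_order, ratio_scores.round(3))
```

In this sketch the linear criterion puts the linear feature first and scores the quadratic feature close to zero, while the binned criterion puts the quadratic feature first. The ability to detect nonlinear relationships is one example of a property in which criteria can differ.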
