Abstract

Feature selection is important, especially for data sets with a large number of variables and features: it eliminates unimportant variables and improves classification accuracy and performance. With the rapid growth of data across many industries, some data sets are high-dimensional and contain essential but complex hidden relationships, posing new problems for feature selection: (i) how to extract the underlying relationships from the data, and (ii) how to apply the learned relationships to improve feature selection. To address these issues, we use six feature selection approaches as a pre-processing step in the analysis to avoid overfitting and potential model underperformance; these approaches can learn and exploit the underlying sample and feature relations for feature selection. This study compares six feature selection approaches (Pearson coefficient, correlation matrix, variable importance, forward selection, and backward elimination) for determining the decomposition level of forest trees. Our trials provide a comparative evaluation of the wrapper approach from several angles. Furthermore, we compare the data set results against the critical attributes to obtain the highest accuracy. The experimental results show that the wrapper technique outperforms all other methods in every experiment group.
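To make the wrapper approach concrete, the following is a minimal sketch of forward selection and backward elimination using scikit-learn's SequentialFeatureSelector. The synthetic dataset, the random-forest estimator, and all parameter values are illustrative assumptions, not the paper's actual data or settings.

```python
# Minimal sketch of wrapper-based feature selection (forward selection and
# backward elimination). Dataset and parameters are hypothetical.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SequentialFeatureSelector
from sklearn.model_selection import cross_val_score

# Hypothetical dataset: 20 features, only 5 of which are informative.
X, y = make_classification(n_samples=500, n_features=20,
                           n_informative=5, random_state=0)

estimator = RandomForestClassifier(n_estimators=100, random_state=0)

for direction in ("forward", "backward"):
    # The wrapper repeatedly refits the estimator, adding (forward) or
    # removing (backward) the feature that most improves CV accuracy.
    selector = SequentialFeatureSelector(
        estimator, n_features_to_select=5, direction=direction, cv=5
    )
    selector.fit(X, y)
    X_selected = selector.transform(X)
    score = cross_val_score(estimator, X_selected, y, cv=5).mean()
    print(f"{direction}: features {selector.get_support(indices=True)}, "
          f"mean CV accuracy {score:.3f}")
```

Because the wrapper scores each candidate subset by actually training the downstream classifier, it tends to find subsets tuned to that model, which is consistent with it outperforming filter-style criteria such as the Pearson coefficient or a correlation matrix, at the cost of more computation.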
