Abstract

Finding a reduced, relevant subset of predictors is an unavoidable step in predictive modelling. When the datasets involved are heterogeneous and heteroscedastic, as soil samples are, the task becomes harder and calls for an ensemble-based feature selection approach. The proposed algorithm combines filter (RReliefF), wrapper (Adaptive Plus-l and Minus-r), and embedded (Neighbourhood Component Analysis) approaches in an ensemble and applies it to the datasets in both a heterogeneous and a homogeneous manner. Adaptive Plus-l and Minus-r is an experimental modification of the Plus-l and Minus-r wrapper method, intended to enhance the performance of the algorithm. The proposed combination rule of the ensemble filters out the irrelevant predictors for each response variable. The ensemble is then applied recursively over a floating set of predictors to estimate the optimal subsets for multiple response variables in a single run. Each recursion step ensures that only the best subset of features (in terms of length and weights) among the current and previous iterations is retained. The efficiency of the resulting predictor sets was assessed using the Akaike Information Criterion and the calculated predictor weights. The adjusted R², Median Absolute Deviation, and Root Mean Square Error on unseen datasets confirmed the suitability of these predictor sets for predictive modelling in ecological domains.
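The evaluation measures named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names are hypothetical, the Akaike Information Criterion is written in its common least-squares (Gaussian error) form, and the Median Absolute Deviation is assumed to be taken over the residuals around their median, which the abstract does not specify.

```python
import math
from statistics import median


def _residuals(y_true, y_pred):
    # Observed minus predicted value for each sample.
    return [t - p for t, p in zip(y_true, y_pred)]


def rmse(y_true, y_pred):
    # Root Mean Square Error: square root of the mean squared residual.
    res = _residuals(y_true, y_pred)
    return math.sqrt(sum(r * r for r in res) / len(res))


def median_abs_deviation(y_true, y_pred):
    # Assumption: MAD is computed on residuals, centred on their median.
    res = _residuals(y_true, y_pred)
    centre = median(res)
    return median([abs(r - centre) for r in res])


def adjusted_r2(y_true, y_pred, n_predictors):
    # Adjusted R^2 penalises R^2 for the number of predictors used.
    n = len(y_true)
    mean_y = sum(y_true) / n
    ss_res = sum(r * r for r in _residuals(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    r2 = 1.0 - ss_res / ss_tot
    return 1.0 - (1.0 - r2) * (n - 1) / (n - n_predictors - 1)


def aic_least_squares(y_true, y_pred, n_params):
    # Assumption: Gaussian errors, so AIC = n * ln(RSS / n) + 2k
    # (up to an additive constant that does not affect comparisons).
    n = len(y_true)
    rss = sum(r * r for r in _residuals(y_true, y_pred))
    return n * math.log(rss / n) + 2 * n_params
```

Under these assumptions, a candidate predictor subset with a lower AIC, a lower RMSE and MAD, and a higher adjusted R² on held-out data would be preferred over a competing subset.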
