Abstract

Grey theory can represent uncertainty and has proved applicable to prioritizing features in estimation problems and various decision-making problems. This study analyses the effectiveness of grey theory for feature selection in daily dew point temperature (DPT) and daily pan-evaporation (PAN-EVP) estimation models. The feature subset identified by grey theory is compared with subsets selected from the very high, high, medium, and low Pearson correlation coefficient (PCC) slabs. Random Forest (RF) and Extreme Gradient Boosting (XGBoost) are used for modelling. Model performance is evaluated using the root mean squared error (RMSE), mean absolute error (MAE), and coefficient of determination (R²). The results show that models built on the high PCC feature subset underperformed on both datasets. The models with features selected using grey theory and the medium PCC slab performed identically for both datasets. For the PAN-EVP dataset, the grey theory-based subset is larger, and both the RF and XGBoost models built on it gave accuracy measures within the calculated average, unlike the medium PCC slab subset. For the DPT dataset, the grey theory-based subset is smaller, and the RF model built on it gave accuracy measures within the calculated average, unlike the medium PCC slab subset. For DPT estimation with the XGBoost model, both the grey theory and medium PCC slab subsets gave accuracy measures close to the calculated average. The study concludes that models using grey theory-based feature selection demonstrated average or above-average performance, and that grey theory is therefore an effective feature selection technique.
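The abstract does not say which grey-theory procedure was used to rank features. As an illustration only, the sketch below assumes Deng's grey relational analysis with min-max normalisation and a distinguishing coefficient of 0.5, and places it next to the absolute PCC and the three reported accuracy measures. All function names and the toy data are hypothetical, and the PCC slab boundaries are not given in the abstract, so none are assumed here.

```python
import numpy as np

def grey_relational_grades(X, y, rho=0.5):
    """Grey relational grade of each column of X against the target y.

    Assumption: Deng's formulation with min-max normalisation; rho is the
    distinguishing coefficient. Columns must not be constant.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    Xn = (X - X.min(axis=0)) / (X.max(axis=0) - X.min(axis=0))
    yn = (y - y.min()) / (y.max() - y.min())
    delta = np.abs(Xn - yn[:, None])        # deviation from the reference series
    dmin, dmax = delta.min(), delta.max()   # global minimum and maximum deviation
    coeff = (dmin + rho * dmax) / (delta + rho * dmax)
    return coeff.mean(axis=0)               # one grade per feature, higher = closer

def pearson_abs(X, y):
    """Absolute Pearson correlation of each column of X with y."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    return np.array([abs(np.corrcoef(X[:, j], y)[0, 1]) for j in range(X.shape[1])])

def rmse(y_true, y_pred):
    d = np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean(d ** 2)))

def mae(y_true, y_pred):
    d = np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float)
    return float(np.mean(np.abs(d)))

def r2(y_true, y_pred):
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)

# Toy data (hypothetical): feature 0 rises with y, feature 1 falls with y.
y = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
X = np.column_stack([10.0 * y, y[::-1]])

grades = grey_relational_grades(X, y)
pcc = pearson_abs(X, y)
ranking = np.argsort(grades)[::-1]          # feature indices, best grade first
```

On this toy data both features have an absolute PCC of 1, while the grey relational grade ranks the feature that moves with the target above the one that moves against it. This shows how the two criteria can select different subsets; it does not reproduce the study's results.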
