Abstract

This study investigated the performance of decision tree-based models for predicting water quantity and quality. The models assessed were decision tree (DT), random forest (RF), and extreme gradient boosting (XGB), each trained on data sets collected from two monitoring stations in the Nakdong River during 2018-2021. The data were split 7:3 into training and testing sets for the three models, and their hyperparameters were tuned to improve prediction accuracy. XGB, which was not sensitive to input data resolution, outperformed both DT and RF. In contrast, the prediction error of the DT model decreased progressively as the monitoring interval shortened from 7 to 3 to 1 day, and also after post-pruning, regardless of the dependent variable. When the prediction accuracy of the RF model was assessed as a function of the number of independent variables, using more than 4 variables was sufficient to maintain performance comparable to that obtained with all variables. Therefore, in addition to hyperparameter optimization, both monitoring frequency and pruning play an important role in reducing the prediction error of decision tree models.
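The data preparation described above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the authors' procedure: the abstract does not say how the 3-day and 7-day series were derived or whether the 7:3 split was chronological or random, so the sketch assumes stride-based subsampling of a daily record and a chronological split. The function names `subsample`, `split_7_3`, and `rmse` are hypothetical, and RMSE is used as a stand-in error metric.

```python
import math


def subsample(daily_records, interval_days):
    """Keep every `interval_days`-th record of a daily series.

    Assumption: coarser monitoring frequencies (3-day, 7-day) are
    emulated by striding over the daily record.
    """
    if interval_days < 1:
        raise ValueError("interval_days must be at least 1")
    return daily_records[::interval_days]


def split_7_3(records):
    """Split records 7:3 into training and testing sets.

    Assumption: the split is chronological (first 70% for training).
    Integer arithmetic avoids floating-point rounding at the cut point.
    """
    cut = len(records) * 7 // 10
    return records[:cut], records[cut:]


def rmse(observed, predicted):
    """Root-mean-square error between two equal-length sequences."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


if __name__ == "__main__":
    # Hypothetical daily series: day indices standing in for measurements.
    daily = list(range(28))
    for interval in (7, 3, 1):
        series = subsample(daily, interval)
        train, test = split_7_3(series)
        print(interval, len(series), len(train), len(test))
```

Each resampled series would then be used to fit the DT, RF, and XGB models, with the error on the held-out 30% compared across the three monitoring intervals.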
