Abstract

This work aims to improve feature selection for data-driven rainfall–runoff models by assessing the significance of each input variable in the learning process and analysing it from a physical point of view. For this purpose, a set of 14 experiments was carried out in two watersheds of the Santa Lucía Chico basin, Uruguay. In each watershed, a random forest model was trained and tested for daily discharge prediction using different sets of input variables. A feature importance analysis was carried out for each model using a non-model-biased method (Shapley additive explanations). Results showed that the most relevant variables were the discharges lagged by one and two days, along with the seven-day accumulated rainfall, which is interpreted as a proxy for the soil moisture condition of the watershed. Temperature was also relevant and was shown to represent the effect of the whole set of climatic variables (relative humidity, solar radiation, wind speed).
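
As a rough illustration of the three inputs the abstract identifies as most relevant, the sketch below derives one- and two-day lagged discharge and a seven-day accumulated rainfall from two daily series. The function name `build_features`, the dictionary keys, and the choice to include the current day in the rainfall window are assumptions made for this example; the abstract does not specify how the authors constructed these variables.

```python
from typing import Dict, List


def build_features(
    discharge: List[float], rainfall: List[float], window: int = 7
) -> List[Dict[str, float]]:
    """Build one feature row per day that has a complete history.

    Hypothetical helper: names and windowing convention are assumptions,
    not the authors' implementation.
    """
    if len(discharge) != len(rainfall):
        raise ValueError("discharge and rainfall must have the same length")

    rows = []
    # Each row needs two past discharges and `window` days of rainfall.
    start = max(2, window - 1)
    for t in range(start, len(discharge)):
        rows.append(
            {
                "q_lag1": discharge[t - 1],  # discharge one day earlier
                "q_lag2": discharge[t - 2],  # discharge two days earlier
                # Assumption: the window ends on (and includes) day t.
                "p_acc": sum(rainfall[t - window + 1 : t + 1]),
                "target_q": discharge[t],  # value the model would predict
            }
        )
    return rows
```

Rows built this way could then be passed to any regressor and feature-attribution tool; the first `max(2, window - 1)` days are dropped because their history is incomplete.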
