Abstract: Accurate prediction of the maximum erosion depth in riverbeds is crucial for the early protection of bank slopes. In this study, K-means clustering analysis was used for outlier identification and feature selection, yielding Plan 1 with six influential features; Plan 2 comprised the features selected by existing methods. Regression models were built with Support Vector Regression (SVR), Random Forest Regression (RF Regression), and eXtreme Gradient Boosting (XGBoost) on the sample data from Plan 1 and Plan 2. To further improve accuracy, a Stacking method with a feed-forward neural network as the meta-learner was introduced. Model performance was evaluated using root mean squared error (RMSE), mean absolute error (MAE), mean absolute percentage error (MAPE), and the coefficient of determination (R²). The results show that all three models performed better on Plan 1 than on Plan 2, with R² improvements of 0.0025, 0.0423, and 0.0205, respectively. Among the three regression models in Plan 1, RF Regression performed best with an R² of 0.9149, which is still lower than the 0.9389 achieved by the Stacking fusion model. The Stacking model also showed better predictive performance than the existing formulas. This study verifies the effectiveness of combining clustering analysis, feature selection, and the Stacking method for predicting the maximum scour depth in bends, providing a novel approach for bank protection design.
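The four evaluation metrics named above can be illustrated with a minimal sketch in plain Python. This is not the study's code: the function name `regression_metrics` and the sample scour depths are hypothetical and chosen only to show how the metrics are computed, using the standard textbook definitions.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return RMSE, MAE, MAPE (in percent), and R^2 for paired observations.

    Hypothetical helper for illustration; uses the standard definitions.
    """
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")
    if any(t == 0 for t in y_true):
        raise ValueError("MAPE is undefined when an observed value is zero")

    n = len(y_true)
    errors = [p - t for t, p in zip(y_true, y_pred)]

    # Root mean squared error and mean absolute error
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    mae = sum(abs(e) for e in errors) / n

    # Mean absolute percentage error, relative to the observed values
    mape = 100.0 * sum(abs(e / t) for e, t in zip(errors, y_true)) / n

    # Coefficient of determination: 1 - SS_res / SS_tot
    mean_true = sum(y_true) / n
    ss_res = sum(e * e for e in errors)
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all observed values are equal")
    r2 = 1.0 - ss_res / ss_tot

    return {"rmse": rmse, "mae": mae, "mape": mape, "r2": r2}


# Made-up observed and predicted maximum scour depths (units arbitrary)
observed = [2.0, 4.0, 6.0, 8.0]
predicted = [2.5, 3.5, 6.5, 7.5]
metrics = regression_metrics(observed, predicted)
print(metrics)
```

Lower RMSE, MAE, and MAPE and a higher R² indicate a better fit, which is the sense in which the R² values reported above are compared across models and plans.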