Accurate prediction of reservoir landslide displacement is crucial for early warning and hazard prevention. Current machine learning (ML) approaches to landslide displacement prediction perform well, but they often rely on feature engineering techniques such as constructing inputs at different temporal lags and feature selection. This study investigates how feature selection techniques affect the performance of ML algorithms for landslide displacement prediction. The Shuping and Baishuihe landslides in China’s Three Gorges Reservoir Area are used to benchmark four prevalent ML algorithms: two static models, the backpropagation neural network (BPNN) and the support vector machine (SVM), and two dynamic models, long short-term memory (LSTM) and the gated recurrent unit (GRU). Each ML model is evaluated under three feature engineering settings: raw multivariate time series with no feature selection, feature selection using the maximal information coefficient with the partial autocorrelation function (MIC-PACF), and feature selection using grey relational analysis with PACF (GRA-PACF). The results demonstrate that appropriate feature selection methods can significantly improve the performance of static ML models. In contrast, dynamic models exploit their inherent ability to capture temporal dynamics in raw multivariate time series and gain only marginally from extensive feature engineering compared with using no feature selection. The optimal feature selection approach varies with the ML model and the specific landslide, highlighting the importance of case-specific assessment. These findings offer guidance on integrating feature selection techniques with different ML models to maximize the robustness and generalizability of data-driven landslide displacement prediction frameworks.
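To make the grey relational analysis step more concrete, the following minimal sketch ranks candidate input series by their grey relational grade against a displacement series. The abstract does not give the formulation used in the study, so the sketch assumes the commonly used Deng form with min-max normalization and a distinguishing coefficient of 0.5. The function name, the variable names, and the synthetic data are hypothetical, and the PACF lag-selection step is not shown.

```python
import numpy as np


def grey_relational_grades(reference, candidates, rho=0.5):
    """Grey relational grade of each candidate column against the reference.

    Assumed formulation (not taken from the study): Deng's grey relational
    coefficient with min-max normalization and distinguishing coefficient rho.
    """
    ref = np.asarray(reference, dtype=float)    # shape (n_samples,)
    cand = np.asarray(candidates, dtype=float)  # shape (n_samples, n_features)

    def min_max(a):
        # Scale each series to [0, 1]; constant series are left at 0.
        span = a.max(axis=0) - a.min(axis=0)
        span = np.where(span == 0, 1.0, span)
        return (a - a.min(axis=0)) / span

    ref_n = min_max(ref)
    cand_n = min_max(cand)

    # Absolute difference between each candidate and the reference.
    delta = np.abs(cand_n - ref_n[:, None])
    d_min, d_max = delta.min(), delta.max()
    if d_max == 0:
        # Every candidate matches the reference exactly after scaling.
        return np.ones(cand.shape[1])

    # Grey relational coefficient per sample, averaged into a grade per feature.
    coeff = (d_min + rho * d_max) / (delta + rho * d_max)
    return coeff.mean(axis=0)


# Synthetic illustration only: one candidate tracks the reference linearly,
# the other moves in the opposite direction.
displacement = np.arange(10.0)
features = np.column_stack([2.0 * displacement + 3.0, displacement[::-1]])
grades = grey_relational_grades(displacement, features)
ranking = np.argsort(grades)[::-1]  # feature indices, most related first
```

In a workflow of the kind the abstract describes, features whose grade exceeds a chosen threshold would be kept as model inputs; the threshold itself is a modelling choice that the abstract does not specify.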