Intelligent rock burst prediction models frequently suffer from incomplete data coverage and data imbalance. These issues can cause overfitting, poor generalization, and increased bias, which in turn may lead to misjudgments and unpredictable losses. To predict rock burst disasters accurately and to mitigate or eliminate the associated threats, this paper proposes a composite prediction model that integrates the Density-Based Nonlinear Resampling (DBNR)-Tomek Link data balancing algorithm with a Bayesian Optimization (BO)-Multilayer Perceptron (MLP)-Random Forest (RF) model. First, we collected and organized 301 field observation records of rock burst disasters, covering various tectonic plates, engineering types, rock origins, rock textures, and rock burst types. Next, from a data analysis perspective, we applied the PCA-SSA-K-means unsupervised clustering algorithm to explore the information underlying the data, which validated the division of rock bursts into four grades. Then, using the L2 norm to optimize the dimensionality of the indicators, supplemented by indicator importance ranking and hypothesis testing, we selected three criteria for rock burst intensity grading: the maximum tangential stress of the surrounding rock, the stress coefficient (the ratio of the maximum tangential stress of the surrounding rock to the uniaxial compressive strength of the rock), and the elastic energy index. The DBNR-Tomek Link sampling method was then applied to balance the sample data, expanding the sample size to 396 and improving the sample proportion from 2:3:4:1 to 1:1:2:1, thereby enhancing the model’s generalization performance.
Finally, a BO-MLP-RF composite prediction model was constructed from the Bayesian Optimization (BO), Multilayer Perceptron (MLP), and Random Forest (RF) algorithms, with Bayesian Optimization ensuring that the model fits the training data well and generalizes to the test data. Tenfold cross-validation showed that the model’s accuracy is consistently around 92.5%. Combined with the training results of rock burst models on imbalanced datasets, this supports the choice of the MLP model, which is adept at modeling nonlinear data, and the RF model, which is skilled at modeling large-scale data, as the base classifiers. These results indicate that data balancing and the combined discriminative model scheme have enhanced the model’s predictive performance and stability. The model can provide high-accuracy, high-efficiency early warning and monitoring services for rock bursts in rock engineering, thereby helping to ensure engineering safety.
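The Tomek Link cleaning step named in the abstract can be illustrated with a minimal sketch. A Tomek link is a pair of samples from different classes that are each other's nearest neighbor; such pairs sit on class boundaries and are candidates for removal. The function names and the toy one-dimensional data below are hypothetical, the DBNR resampling step is not reproduced, and dropping both members of each link is only one common variant, assumed here because the abstract does not state which variant was used.

```python
from math import dist


def nearest_neighbor(i, X):
    """Index of the sample closest to X[i], excluding X[i] itself."""
    return min((j for j in range(len(X)) if j != i),
               key=lambda j: dist(X[i], X[j]))


def tomek_links(X, y):
    """Pairs (i, j) with i < j that are mutual nearest neighbors
    and carry different class labels."""
    nn = [nearest_neighbor(i, X) for i in range(len(X))]
    return [(i, j) for i, j in enumerate(nn)
            if i < j and nn[j] == i and y[i] != y[j]]


def remove_tomek_links(X, y):
    """Drop both members of every Tomek link (assumed variant)."""
    drop = {k for pair in tomek_links(X, y) for k in pair}
    keep = [i for i in range(len(X)) if i not in drop]
    return [X[i] for i in keep], [y[i] for i in keep]


# Hypothetical toy data: one feature, two classes.
X = [(0.0,), (1.0,), (1.2,), (3.0,), (3.1,)]
y = [0, 0, 1, 1, 1]

links = tomek_links(X, y)
X_clean, y_clean = remove_tomek_links(X, y)
```

In this toy set, samples 1 and 2 are mutual nearest neighbors with different labels, so they form the only link; samples 3 and 4 are also mutual nearest neighbors but share a label and are kept.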