In studies of gravelly soil liquefaction discrimination, the number of training samples significantly affects model performance. However, historical earthquakes provide only a limited number of gravelly soil liquefaction cases, so the generalization performance of existing models cannot be guaranteed. Additionally, valuable data from liquefiable critical layer points and from unlabeled points outside the critical layer are often overlooked in stratigraphic exploration. Therefore, this paper proposes a data extension method for semi-supervised learning based on the Markov chain Monte Carlo–Bayesian network (MCMC-BN) method and constructs a semi-MCMC-BN gravelly soil liquefaction discrimination model. By effectively mining the available site liquefaction information, the proposed method expands the gravelly soil liquefaction dataset approximately fourfold, from 120 initial liquefaction samples to 476. The results demonstrate that the extended liquefaction database improves the distribution of liquefaction data for gravelly soils. The semi-supervised learning model outperforms the supervised learning model, which indicates that the semi-supervised approach is feasible for gravelly soil liquefaction discrimination. In addition, debris flow data for the Wenchuan area, China, are used to validate the effectiveness of the proposed method. Finally, the impacts of the magnitude of the confidence level, the selection of the critical soil layer, and the choice of machine learning method on the data extension outcomes are discussed. The extended database can provide data support for studying gravelly soil liquefaction, and the proposed data extension framework can be used for related research in other fields.