Abstract
Sampling bias is an inevitable issue in seismological research, especially in the collection of seismic site condition data, because of operational constraints. It affects data representativeness and, in turn, model performance. Previous studies have given insufficient consideration to this issue: some worked directly on raw data, while others used conventional declustering methods that rely on the spatial distribution of the data and have proved ineffective. This study investigates sampling bias in seismological research by employing a debiasing method that incorporates secondary variables, focusing on VS30 datasets from mainland China, Japan, Türkiye, and Taiwan. Quantitative analysis shows that, when topographic slope is taken as the secondary variable, the sampling biases in the mainland China and Taiwan datasets are more pronounced. When geological age is taken as the secondary variable, sampling biases in the Japan and Türkiye datasets that are not readily discernible through visual inspection of their spatial distributions become apparent. By investigating the sampling bias of the Türkiye and mainland China datasets using the semivariogram as the secondary variable, this study reveals hidden bias within the Türkiye dataset, despite its well-distributed appearance. This finding further illustrates the limitations of relying solely on spatial distribution to detect sampling bias. In addition, the study examines the impact of sampling bias on the resulting models. The topographic slope-based VS30 proxy models built from debiased data in the four regions demonstrate significant effects of debiasing on modeling outcomes. Notably, the debiased models exhibit a homogenized trend compared with the original models in the low topographic slope range, indicating the possibility of a globally consistent relationship.
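The abstract does not spell out the debiasing procedure, so the following is only a minimal sketch of one common way to use a secondary variable: each sample is reweighted so that the distribution of the secondary variable (here, topographic slope) among the samples matches its distribution over the whole region. The function name, the equal-width binning, and the toy numbers are all illustrative assumptions, not the study's method or data.

```python
import numpy as np

def secondary_variable_weights(sample_secondary, population_secondary, n_bins=10):
    """Hypothetical helper: weight each sample by the ratio of the
    population fraction to the sample fraction in its secondary-variable bin."""
    sample_secondary = np.asarray(sample_secondary, dtype=float)
    population_secondary = np.asarray(population_secondary, dtype=float)

    # Equal-width bins spanning both the samples and the population (an assumption).
    lo = min(sample_secondary.min(), population_secondary.min())
    hi = max(sample_secondary.max(), population_secondary.max())
    edges = np.linspace(lo, hi, n_bins + 1)

    # Bin index in 0..n_bins-1 for every value; the maximum falls in the last bin.
    sample_bin = np.digitize(sample_secondary, edges[1:-1])
    population_bin = np.digitize(population_secondary, edges[1:-1])

    sample_frac = np.bincount(sample_bin, minlength=n_bins) / sample_secondary.size
    population_frac = np.bincount(population_bin, minlength=n_bins) / population_secondary.size

    # Every sample lies in a bin that contains at least itself, so no division by zero.
    weights = population_frac[sample_bin] / sample_frac[sample_bin]
    return weights / weights.sum()

# Toy illustration (made-up numbers): 80% of the region is flat, but only
# one of the four measurement sites is on flat ground.
region_slope = np.array([0, 0, 0, 0, 0, 0, 0, 0, 1, 1], dtype=float)
site_slope = np.array([0, 1, 1, 1], dtype=float)
site_vs30 = np.array([200.0, 600.0, 600.0, 600.0])

weights = secondary_variable_weights(site_slope, region_slope, n_bins=2)
naive_mean = site_vs30.mean()
debiased_mean = np.sum(weights * site_vs30)
```

In this toy case the flat-ground site receives most of the weight, so the weighted mean moves toward the value that site reports. Whether a comparable shift appears in real VS30 data depends on the dataset and on the choice of secondary variable.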