Abstract

Earthquake catalogs are essential for analyzing the evolution of active fault systems. The background seismicity rate, that is, the rate of earthquakes not directly triggered by other earthquakes, relates directly to the stressing rate, a crucial quantity for understanding seismic hazard. Determining the background seismicity rate is challenging because aftershock sequences may dominate the overall seismicity rate. Classifying these events in earthquake catalogs, a practice known as catalog declustering, is common, and most declustering solutions rely on spatiotemporal distances between events, such as the widely used nearest-neighbor-distance algorithm. This algorithm assumes that the nearest-neighbor distance (NND) follows a bimodal distribution, with one mode associated with background seismicity and the other with aftershocks. Constraining these two distributions is crucial for accurately distinguishing aftershocks from background events. Recent work often applies a linear split of the NND distribution, ignoring the potential overlap between the two populations and biasing the identification of background earthquakes and aftershock sequences. We revisit this problem with machine-learning algorithms. After testing several popular algorithms, we show that a random forest trained on synthetic catalogs generated by an Epidemic-Type Aftershock Sequence (ETAS) model outperforms approaches such as k-means, Gaussian-mixture models, and support vector machine classification. We apply our model to two earthquake catalogs: the relocated Southern California Earthquake Center catalog and the GeoNet catalog of New Zealand. Our model adapts well to these two different tectonic contexts, highlighting differences in aftershock productivity between crustal and intermediate-depth seismicity.
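To make the classification setup concrete, the following is a minimal sketch, not the authors' implementation, of training a random forest on NND-style features. The synthetic data, feature choices, and variable names are illustrative assumptions; the features follow the commonly used rescaled form of the NND, in which eta factors into a rescaled time T and a rescaled distance R so that log10(T) + log10(R) = log10(eta), and labels are assumed to come from an ETAS simulation in which the true triggering history is known.

```python
# Minimal sketch, not the authors' code: separate background events from
# aftershocks using nearest-neighbor-distance (NND) features and a random
# forest. The synthetic data below are illustrative assumptions.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(seed=0)

n = 5000
# Hypothetical ground-truth labels: 0 = background, 1 = aftershock
# (in the paper these would come from ETAS simulations).
labels = rng.integers(0, 2, size=n)

# Two overlapping log-normal modes stand in for the bimodal NND distribution;
# the overlap is exactly what a hard linear split on log10(eta) cannot resolve.
log_eta = np.where(labels == 0,
                   rng.normal(-4.5, 1.0, size=n),   # background mode
                   rng.normal(-7.5, 1.2, size=n))   # aftershock mode

# Rescaled-time and rescaled-distance components, constructed so that
# log10(T) + log10(R) = log10(eta), as in the usual NND decomposition.
log_T = 0.5 * log_eta + rng.normal(0.0, 0.8, size=n)
log_R = log_eta - log_T

X = np.column_stack([log_eta, log_T, log_R])

clf = RandomForestClassifier(n_estimators=200, random_state=0)
clf.fit(X, labels)

# Probabilistic output keeps the overlap between the two populations explicit
# rather than forcing a hard threshold on the NND.
print(clf.predict_proba(X[:5]))
```

A probabilistic classifier of this kind can assign each event a likelihood of being an aftershock, which is one way to avoid the biased hard split that the abstract criticizes; the specific features and model settings used by the authors may differ.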