Abstract

The Machine Learning Asset Aggregation of the Preliminary Determination of Epicenters (MLAAPDE) dataset is a labeled waveform archive designed to enable rapid development of machine learning (ML) models used in seismic monitoring operations. MLAAPDE consists of more than 5.1 million recordings of 120 s long three-component broadband waveform data (raw counts) for P, Pn, Pg, S, Sn, and Sg arrivals. The labeled catalog is collected from the U.S. Geological Survey National Earthquake Information Center’s (NEIC) Preliminary Determination of Epicenters bulletin, which includes local to teleseismic observations for earthquakes of ∼M 2.5 and larger. Each arrival in the labeled dataset has been manually reviewed by NEIC staff. An accompanying Python module enables users to build customized training datasets with different time-series lengths, distance ranges, sampling rates, and/or phase lists. MLAAPDE is distinct from other publicly available datasets in containing local (14%), regional (36%), and teleseismic (50%) observations, where local, regional, and teleseismic distances are defined as 0°–3°, 3°–30°, and 30°+, respectively. A recent version of the dataset is publicly available (see Data and Resources), and user-specific versions can be generated locally with the accompanying software. MLAAPDE is an NEIC-supported, curated, and periodically updated dataset that can contribute to seismological ML research and development.
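The distance classes defined above can be illustrated with a minimal Python sketch. This is not part of the MLAAPDE module: the function name `distance_class` is hypothetical, and the handling of the boundary values (3° counted as regional, 30° counted as teleseismic) is an assumption, because the abstract does not say which class the boundaries belong to.

```python
from collections import Counter


def distance_class(distance_deg: float) -> str:
    """Classify an epicentral distance (degrees) using the abstract's ranges.

    Hypothetical helper for illustration only. Boundary handling is assumed:
    [0, 3) is local, [3, 30) is regional, and 30 or more is teleseismic.
    """
    if distance_deg < 0:
        raise ValueError("epicentral distance must be non-negative")
    if distance_deg < 3.0:
        return "local"
    if distance_deg < 30.0:
        return "regional"
    return "teleseismic"


# Tally the classes for a handful of made-up example distances (degrees).
example_distances = [0.4, 2.9, 3.0, 17.5, 30.0, 85.2]
class_counts = Counter(distance_class(d) for d in example_distances)
```

A tally like `class_counts` is one way to check how a customized subset compares with the 14%, 36%, and 50% split reported for the full dataset.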
