Accurately identifying and labeling seismic events is essential for understanding the internal dynamics of volcanoes and for forecasting eruptions. However, manual labeling is labor-intensive and costly, and it requires the expertise of highly skilled volcanologists. Unsupervised learning offers an alternative: it automatically discovers hidden structure in the data, which allows seismic events to be characterized without manual labels. We compare two widely used unsupervised learning algorithms, k-means and the Gaussian mixture model (GMM), which differ in their assignment strategy (hard assignment for k-means, soft assignment for GMM), to learn a feature representation from an unlabeled dataset collected at the Cotopaxi volcano. The learned feature representations are then evaluated on a supervised task, the classification of long-period (LP) and volcano-tectonic (VT) events, using labeled datasets from the Cotopaxi and Llaima volcanoes. We also compare our approach with a baseline of hand-crafted features engineered by domain experts over the years. Our findings show that the unsupervised techniques learn a wide range of relevant features that correlate highly with the baseline features. Furthermore, in the classification task, our unsupervised approach achieves performance statistically similar to that of the baseline features. Notably, the GMM feature representation outperforms the baseline features on noisy VT and LP signals from the BREF station at the Cotopaxi volcano.
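To make the distinction between hard and soft assignment concrete, the following is a minimal sketch in plain Python, not the pipeline used in this work. It assumes the cluster centers are already fitted, and for the soft case it further assumes equal mixture weights and a shared isotropic variance `sigma**2`, which is a simplification of a general GMM. The function names, the two-dimensional toy centers, and the sample point are all hypothetical.

```python
import math


def squared_distances(x, centers):
    """Squared Euclidean distance from point x to each center."""
    return [sum((a - b) ** 2 for a, b in zip(x, c)) for c in centers]


def hard_assignment(x, centers):
    """k-means style encoding: one-hot vector marking the nearest center."""
    dists = squared_distances(x, centers)
    nearest = dists.index(min(dists))
    return [1.0 if k == nearest else 0.0 for k in range(len(centers))]


def soft_assignment(x, centers, sigma=1.0):
    """GMM style encoding: posterior responsibility of each component.

    Assumption (illustrative only): equal mixture weights and a shared
    isotropic variance sigma**2, so the posterior reduces to a softmax
    over negative scaled squared distances.
    """
    dists = squared_distances(x, centers)
    logits = [-d / (2.0 * sigma ** 2) for d in dists]
    peak = max(logits)  # subtract the max for numerical stability
    weights = [math.exp(v - peak) for v in logits]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical toy data: two centers and one sample lying between them.
centers = [(0.0, 0.0), (4.0, 0.0)]
sample = (1.5, 0.0)

hard = hard_assignment(sample, centers)
soft = soft_assignment(sample, centers)
print("hard:", hard)
print("soft:", [round(p, 4) for p in soft])
```

The hard encoding keeps only the identity of the nearest cluster, whereas the soft encoding retains how strongly the sample belongs to every component, which is one plausible reason a soft representation could be more informative for ambiguous or noisy signals.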