Abstract

The collection and annotation of bioacoustic data presents several challenges to researchers, often constraining analysis to highly vocal species. Computational tools allow monitoring to be extended to less vocal and more challenging species, but data limitations remain an issue. We present a human-in-the-loop approach that combines the efficiency of computational tools with the accuracy of human analysis. We use a wavelet-based segmentation method that automatically extracts transient features from field recordings; it can reduce the data volume by up to 90% and requires as few as one reference feature. Segmented features are then used to fine-tune a transformer-based model, the Audio Spectrogram Transformer (AST), whose output is verified by a human analyst, with the corrected data fed back into the model to improve performance over time. We also present an outlier detection approach based on Mel-frequency cepstral coefficients (MFCCs): the coefficients are projected to two dimensions and outliers are detected using the silhouette score. This approach achieved 98.8% validation accuracy on a binary classification task using a limited dataset of 200 five-minute recordings with sparse features (occurrence rates of less than 1%). These methods make real-time bioacoustic monitoring of less vocal species feasible.
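The abstract does not specify how the MFCCs are projected to 2-D, how clusters are formed before scoring, or what silhouette threshold marks an outlier. The sketch below is therefore only a minimal illustration of the general idea, assuming PCA for the projection, k-means for the cluster labels that the silhouette score requires, and a zero threshold; the synthetic clips stand in for real segmented field recordings.

```python
import numpy as np
import librosa
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_samples


def mfcc_embedding(clip, sr, n_mfcc=13):
    """Summarise a clip as its time-averaged MFCC vector."""
    mfcc = librosa.feature.mfcc(y=clip, sr=sr, n_mfcc=n_mfcc)
    return mfcc.mean(axis=1)


def flag_outliers(clips, sr, n_clusters=2, threshold=0.0):
    """Project MFCC embeddings to 2-D and flag low-silhouette clips as outliers.

    PCA, k-means, and the zero threshold are illustrative assumptions,
    not the paper's specified configuration.
    """
    X = np.stack([mfcc_embedding(c, sr) for c in clips])
    X2d = PCA(n_components=2).fit_transform(X)        # 2-D projection
    labels = KMeans(n_clusters=n_clusters, n_init=10,
                    random_state=0).fit_predict(X2d)  # cluster labels for scoring
    scores = silhouette_samples(X2d, labels)          # per-clip silhouette values
    return scores < threshold                         # True = candidate outlier


if __name__ == "__main__":
    sr = 22050
    rng = np.random.default_rng(0)
    # Synthetic stand-in data: two groups of tonal "calls" plus one noise-only clip.
    clips = [np.sin(2 * np.pi * f * np.arange(sr) / sr)
             + 0.05 * rng.standard_normal(sr)
             for f in (1000, 1050, 3000, 3100)]
    clips.append(0.5 * rng.standard_normal(sr))       # noise clip, likely flagged
    print(flag_outliers(clips, sr))
```

In a human-in-the-loop workflow, clips flagged here would be routed to an analyst for verification rather than discarded automatically.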
