Feature evaluation for unsupervised bioacoustic signal segmentation of anuran calls

Juan G Colonna, Eduardo F Nakamura, Osvaldo A Rosso

doi:10.1016/j.eswa.2018.03.062

Abstract

We present a comprehensive study of temporal Low-Level acoustic Descriptors (LLDs) for automatically segmenting anuran calls in audio streams. Acoustic segmentation, or syllable extraction, is a key task shared by most bioacoustic species recognition systems, so it has a direct impact on the classification rate. In this work, we assess several entropy measures, including the recently developed Permutation Entropy, Weighted Permutation Entropy, and Permutation Min-Entropy, and compare them with the classical Energy, Zero Crossing Rate, and Spectral Entropy. In addition, we propose an algorithm to estimate the optimal segmentation threshold used to separate deterministic segments from stochastic ones while avoiding the creation of thin clusters. To assess the performance of our segmentation approach, we apply frame-by-frame, point-to-point, and event-to-event comparisons. We show that under severe noise conditions (SNR ≤ 0 dB), simple entropy descriptors are robust, achieving 97% segmentation performance while keeping a low computational cost. We conclude that no single LLD is suitable for all scenarios, and that multiple or different LLDs must be adopted depending on the expected noise conditions.
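As an illustration of the kind of temporal descriptor assessed here, the following is a minimal sketch of the standard normalized Permutation Entropy (Bandt-Pompe ordinal patterns) computed frame by frame. It is not the implementation used in this work: the function names and the default values of `order`, `delay`, `frame_len`, and `hop` are assumptions chosen only for the example.

```python
import math
from collections import Counter

import numpy as np


def permutation_entropy(x, order=3, delay=1):
    """Normalized permutation entropy of a 1-D sequence (0 = regular, 1 = random)."""
    x = np.asarray(x, dtype=float)
    n = len(x) - (order - 1) * delay  # number of embedded windows
    if n <= 0:
        raise ValueError("sequence too short for the chosen order and delay")
    # Each window of `order` samples is mapped to its ordinal pattern,
    # i.e. the permutation that sorts its values.
    patterns = Counter(
        tuple(np.argsort(x[i : i + order * delay : delay], kind="stable"))
        for i in range(n)
    )
    probs = np.array(list(patterns.values()), dtype=float) / n
    entropy = -np.sum(probs * np.log(probs))
    # Divide by log(order!) so the result lies in [0, 1].
    return float(entropy / math.log(math.factorial(order)))


def frame_entropies(signal, frame_len=256, hop=128, order=3, delay=1):
    """Permutation entropy of each overlapping frame (illustrative framing values)."""
    signal = np.asarray(signal, dtype=float)
    starts = range(0, len(signal) - frame_len + 1, hop)
    return np.array(
        [permutation_entropy(signal[s : s + frame_len], order, delay) for s in starts]
    )


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    noise = rng.standard_normal(4096)                    # stochastic background
    tone = np.sin(2 * np.pi * 0.01 * np.arange(4096))    # deterministic, call-like
    print("noise:", permutation_entropy(noise))
    print("tone: ", permutation_entropy(tone))
```

In this sketch a deterministic, call-like tone yields a noticeably lower entropy than white noise, which is the contrast a segmentation threshold would exploit.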

Full Text