Abstract

This paper addresses bird song scene analysis, focusing on the locations of birds and the acoustic features of their songs. Research in this area usually requires manual annotation of the positions and/or vocalization types of the target animals for a large amount of observed data. Such manual annotation, however, has two problems. First, annotating data observed in real environments is difficult: environmental noise exists, sound is reflected by trees and the ground, and several birds at different locations may sing at the same time. Second, manual annotation inevitably produces inaccurate and inconsistent labels due to human errors and individual differences among annotators. For the first problem, we propose a Spatial-Cue-Based Probabilistic Model (SCBPM), a probabilistic model that estimates the maximum-likelihood result for bird song scene analysis by integrating sound source detection, localization, separation, and identification based on spatial information about the sound sources. For the second problem, we employ a semi-automatic annotation approach, in which a semi-supervised training method is derived for SCBPM; this method reduces the amount of manual annotation required. Preliminary experiments using bird song data recorded in the wild revealed that our system outperformed, in terms of identification accuracy, a conventional bird song scene analysis system that simply connects sound source detection, localization, separation, and identification in cascade.
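The abstract's core contrast is between a cascade pipeline, which commits to the best answer at each stage, and SCBPM's integrated inference, which maximizes a joint likelihood over all stages at once. The toy sketch below illustrates why these can disagree; all names, the toy likelihood table, and the two-stage simplification (location, then species) are illustrative assumptions, not taken from the paper.

```python
# Hypothetical sketch: cascade (stage-wise) decisions vs. joint
# maximum-likelihood inference, in the spirit of the SCBPM idea.
# The likelihood values and labels below are invented for illustration.

locations = ["north", "south"]
species = ["warbler", "thrush"]

# Toy joint likelihoods p(location, species | observation), summing to 1.
likelihood = {
    ("north", "warbler"): 0.10,
    ("north", "thrush"):  0.35,
    ("south", "warbler"): 0.30,
    ("south", "thrush"):  0.25,
}

def cascade(likelihood):
    """Pick the best location first (marginalizing over species),
    then the best species given that location."""
    p_loc = {l: sum(likelihood[(l, s)] for s in species) for l in locations}
    best_loc = max(p_loc, key=p_loc.get)
    best_sp = max(species, key=lambda s: likelihood[(best_loc, s)])
    return best_loc, best_sp

def joint(likelihood):
    """Maximize the joint likelihood over all (location, species) pairs."""
    return max(likelihood, key=likelihood.get)

print(cascade(likelihood))  # ('south', 'warbler') — locally optimal per stage
print(joint(likelihood))    # ('north', 'thrush') — globally optimal
```

Here the cascade commits to the marginally most likely location ("south", mass 0.55) and never revisits that choice, so it misses the single most likely joint hypothesis. Integrated inference over all hypotheses avoids this stage-wise error propagation, which is the qualitative advantage the abstract claims for SCBPM.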
