Acoustic recognition of multiple bird species based on penalised maximum likelihood

Peter Jancovic, Munevver Kokuer

doi:10.1109/lsp.2015.2409173

Abstract

An automatic system for the recognition of multiple bird species in audio recordings is presented. Time-frequency segmentation of the acoustic scene is obtained by employing a sinusoidal detection algorithm, which does not require any estimate of the noise and is able to handle multiple simultaneous bird vocalizations. Each segment is characterized as a sequence of frequencies over time, referred to as a frequency track. Each bird species is represented by a hidden Markov model that models the temporal evolution of frequency tracks. The number and identity of the bird species in a given recording are decided by maximizing the overall likelihood of the set of detected segments, with a penalization applied for increasing the number of bird models used. Experimental evaluations are performed on audio field recordings containing 30 bird species. The presence of multiple bird species is simulated by joining the sets of detected segments from several bird species. Results show that the proposed method can achieve recognition performance for multiple bird species not far from that obtained for a single bird species, and that it considerably outperforms majority voting methods.
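
The penalised decision rule described in the abstract can be illustrated with a minimal sketch. This is one possible reading, not the authors' formulation: it assumes the score of a candidate set of species is the sum, over detected segments, of the best log-likelihood among the chosen models, minus a fixed penalty per model, and it assumes a greedy forward search, which the abstract does not specify. The names `penalised_score` and `select_species`, the penalty value, and the toy log-likelihoods are all hypothetical; in a real system each entry would come from scoring a frequency track against a species hidden Markov model.

```python
from typing import List, Sequence, Tuple


def penalised_score(loglik: Sequence[Sequence[float]],
                    chosen: Sequence[int],
                    penalty: float) -> float:
    """Assumed score: each segment is explained by its best chosen model,
    and every model in the set costs a fixed penalty."""
    total = sum(max(row[k] for k in chosen) for row in loglik)
    return total - penalty * len(chosen)


def select_species(loglik: Sequence[Sequence[float]],
                   penalty: float) -> Tuple[List[int], float]:
    """Greedy forward selection (an assumption): add the model that most
    improves the penalised score, and stop when no model improves it."""
    n_models = len(loglik[0])
    chosen: List[int] = []
    best = float("-inf")
    while len(chosen) < n_models:
        candidates = [k for k in range(n_models) if k not in chosen]
        score, k = max(
            (penalised_score(loglik, chosen + [k], penalty), k)
            for k in candidates
        )
        if score <= best:
            break  # adding another model no longer pays for its penalty
        chosen.append(k)
        best = score
    return sorted(chosen), best


# Toy data: rows are detected segments, columns are species models.
# Segments 0-1 fit model 0 well; segments 2-3 fit model 1 well.
loglik = [
    [-10, -30, -40],
    [-12, -28, -35],
    [-33, -11, -38],
    [-31, -9, -36],
]

print(select_species(loglik, penalty=5.0))
print(select_species(loglik, penalty=50.0))
```

With the small penalty the sketch selects two models, since the second one improves the total likelihood by more than it costs; with the large penalty it keeps a single model. This shows how the penalty controls the estimated number of species in a recording.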

Full Text