Blind source separation‐based IVA‐Xception model for bird sound recognition in complex acoustic environments

Yusheng Dai, Bin Wang, Yiwei Dong, Mingzhi Hu, Haipeng Zou, Jin Yang

doi:10.1049/ell2.12160


Open Access

https://doi.org/10.1049/ell2.12160


Abstract

Identifying bird species from audio recordings is a major area of interest in ecological surveillance and biodiversity conservation. Previous studies have successfully identified bird species from given recordings. However, most of these studies are suited only to low-noise acoustic environments and to recordings that contain a single bird's vocalization at a time. In reality, bird audio recorded in the wild often contains overlapping signals, such as the dawn chorus, which makes audio feature extraction and accurate classification extremely difficult. This study is the first to focus on applying a blind source separation method to identify all foreground bird species contained in overlapping vocalization recordings. The proposed IVA-Xception model is based on independent vector analysis (IVA) and a convolutional neural network. Experiments on the dataset of the 2020 Bird Sound Recognition in Complex Acoustic Environments competition (BirdCLEF2020) show that the model achieves a higher macro F1-score and average accuracy than state-of-the-art methods.
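For readers unfamiliar with the macro F1-score mentioned above, the following is a minimal sketch of how it can be computed when each recording may contain several species. The abstract does not specify the exact evaluation protocol, so the function name `macro_f1`, the set-based label representation, and the species names are illustrative assumptions rather than the authors' implementation.

```python
from typing import List, Set


def macro_f1(y_true: List[Set[str]], y_pred: List[Set[str]]) -> float:
    """Unweighted mean of per-species F1 over every species seen in labels or predictions."""
    # Collect the label vocabulary from both ground truth and predictions.
    species = sorted(set().union(*y_true, *y_pred))
    scores = []
    for s in species:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if s in t and s in p)      # correctly detected
        fp = sum(1 for t, p in pairs if s not in t and s in p)  # falsely detected
        fn = sum(1 for t, p in pairs if s in t and s not in p)  # missed
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the species never occurs.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores) if scores else 0.0


# Hypothetical example: three recordings, the first containing two overlapping species.
truth = [{"robin", "wren"}, {"robin"}, {"finch"}]
preds = [{"robin"}, {"robin", "wren"}, {"finch"}]
score = macro_f1(truth, preds)
```

Because every species contributes equally to the mean, a rare species that is often missed lowers the macro F1-score as much as a common one would, which is why the metric is a natural fit for multi-species recordings.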

Full Text