A review of automatic recognition technology for bird vocalizations in the deep learning era

Jiangjian Xie, Yujie Zhong, Junguo Zhang, Shuo Liu, Changqing Ding, Andreas Triantafyllopoulos

doi:10.1016/j.ecoinf.2022.101927

Abstract

Birds are considered critical indicators of ecosystem condition. Automatic recording devices have become a popular tool for assisting field observations, contributing to biodiversity monitoring on large spatio-temporal scales. However, manually processing huge volumes of recordings is challenging. Consequently, interest in automatic bird vocalization recognition has grown in recent years. The technology has advanced from classical pattern recognition to deep learning (DL), with significantly improved recognition performance. This paper reviews work from the last decade on DL-based automatic bird vocalization recognition. We present the current state of research in its three key areas: pre-processing, feature extraction, and recognition methods. The related datasets, evaluation metrics, and software are also summarized. Finally, existing challenges and opportunities for future work are highlighted. We conclude that, while DL-based automatic bird vocalization recognition has recently advanced for specific species, more robust denoising approaches, larger public datasets, and stronger generalization capabilities in feature extraction and recognition are required to achieve reliable and general bird recognition in the wild. We expect this review to serve as a firm foundation for new researchers working on DL-based automatic bird vocalization recognition, and as an insightful guide for computer science and ecology experts.

Full Text