Abstract
For the conservation of avian biodiversity, bird detection is vital since it allows ornithologists to quantify which species exist in a particular area. Analyzing their acoustic signals enables the efficient identification of multiple bird species from overlapping recordings. This paper addresses the classification of bird vocalizations in real-time audio recordings using acoustic analysis. The proposed work presents schemes based on recurrent neural networks (RNNs). Gated recurrent units (GRUs) are a particular type of RNN that has shown remarkable performance in acoustic classification. We propose a hierarchical attention-based bidirectional gated recurrent unit (BiGRU) model for classifying acoustic signals of birds using Mel-frequency cepstral coefficients (MFCCs). The attention mechanism has proven effective in many acoustic, speech, and music processing applications; here it is employed to assign different weights to the information output by the hidden layers of the BiGRU. For test data, we adopt a short-time sliding-aggregation approach in which the probability outputs are summed species-wise and normalized. The species with the highest probability scores are assumed to be the dominant species in the recording. Our Attention-BiGRU classifier achieves strong performance on the Xeno-Canto dataset, with an F1-score of 0.84, competitive with state-of-the-art multi-label classifiers.
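The abstract's pipeline (MFCC frames → BiGRU → attention pooling → per-species probabilities, followed by sum-and-normalize aggregation over short windows) can be illustrated with a minimal sketch. This is not the authors' implementation: the layer sizes, the number of MFCC coefficients, the species count, and the use of sigmoid outputs for the multi-label setting are all assumptions made for illustration.

```python
# Minimal sketch (assumptions, not the paper's code): an attention-based BiGRU
# over MFCC frames, plus the described species-wise sum-and-normalize
# aggregation over short sliding windows of a recording.
import torch
import torch.nn as nn
import torch.nn.functional as F

class AttentionBiGRU(nn.Module):
    def __init__(self, n_mfcc=40, hidden=128, n_species=10):  # sizes are assumed
        super().__init__()
        self.bigru = nn.GRU(n_mfcc, hidden, batch_first=True, bidirectional=True)
        self.att = nn.Linear(2 * hidden, 1)        # scores each time step
        self.out = nn.Linear(2 * hidden, n_species)

    def forward(self, x):                          # x: (batch, frames, n_mfcc)
        h, _ = self.bigru(x)                       # (batch, frames, 2*hidden)
        w = F.softmax(self.att(h), dim=1)          # attention weights over frames
        ctx = (w * h).sum(dim=1)                   # attention-weighted context
        return self.out(ctx)                       # per-species logits

def aggregate_windows(model, windows):
    """Sum per-species probabilities over sliding windows, then normalize.
    The species with the highest normalized scores are taken as dominant."""
    with torch.no_grad():
        probs = torch.sigmoid(model(windows))      # (n_windows, n_species)
    summed = probs.sum(dim=0)
    return summed / summed.sum()
```

As a usage note, `windows` would be a batch of MFCC feature matrices extracted from consecutive short segments of one recording, and `aggregate_windows` returns a normalized score per species, mirroring the decision rule described in the abstract.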