Abstract

We present a deep learning approach to the large-scale prediction and analysis of bird acoustics from 100 different bird species. We use spectrograms constructed from bird audio recordings in the Cornell Bird Challenge (CBC) 2020 dataset, which includes recordings of multiple and potentially overlapping bird vocalizations with background noise. Our experiments show that a hybrid modeling approach, which involves a Convolutional Neural Network (CNN) for learning the representation of a slice of the spectrogram and a Recurrent Neural Network (RNN) for combining the slice representations across time-points, leads to the most accurate model on this dataset. We show results on a spectrum of models ranging from stand-alone CNNs to hybrid models of various types, obtained by combining CNNs with other CNNs or with RNNs of the following types: Long Short-Term Memory (LSTM) networks, Gated Recurrent Units (GRU), and Legendre Memory Units (LMU). The best performing model achieves an average accuracy of 67% over the 100 bird species, with the highest per-species accuracy of 90% for the Red Crossbill. We further analyze the learned representations visually and find them to be intuitive, with related bird species clustered close together. We also present a novel way to empirically interpret the representations learned by the LMU-based hybrid model, showing how memory channel patterns change over time with the changes seen in the spectrograms.
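As a concrete illustration of the hybrid architecture described above, the sketch below (in PyTorch) encodes each spectrogram slice with a small CNN and combines the slice embeddings across time with an LSTM. This is a minimal sketch under assumed dimensions, not the authors' implementation: the backbone layers, embedding size, and slice shape are placeholders for illustration; only the 100-class output reflects the dataset.

    import torch
    import torch.nn as nn

    class CNNRNNClassifier(nn.Module):
        """Hybrid model: a CNN encodes each spectrogram slice, and an RNN
        (here an LSTM) combines the slice embeddings across time."""

        def __init__(self, n_classes=100, embed_dim=256, hidden_dim=128):
            super().__init__()
            # Small CNN encoder, applied independently to every time slice.
            self.encoder = nn.Sequential(
                nn.Conv2d(1, 32, kernel_size=3, padding=1), nn.ReLU(),
                nn.MaxPool2d(2),
                nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.ReLU(),
                nn.AdaptiveAvgPool2d(1),   # -> (batch*slices, 64, 1, 1)
                nn.Flatten(),
                nn.Linear(64, embed_dim),
            )
            self.rnn = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
            self.head = nn.Linear(hidden_dim, n_classes)

        def forward(self, x):
            # x: (batch, time_slices, 1, freq_bins, frames_per_slice)
            b, t = x.shape[:2]
            feats = self.encoder(x.flatten(0, 1))   # (b*t, embed_dim)
            feats = feats.view(b, t, -1)            # (b, t, embed_dim)
            _, (h, _) = self.rnn(feats)             # final hidden state
            return self.head(h[-1])                 # (b, n_classes) logits

    # Usage with a dummy batch: 4 clips, 10 slices of 128 mel bins x 32 frames.
    model = CNNRNNClassifier()
    logits = model(torch.randn(4, 10, 1, 128, 32))
    print(logits.shape)  # torch.Size([4, 100])

Swapping the LSTM for a GRU or an LMU cell, as compared in the paper, only changes the self.rnn line; the slice-encoding CNN is shared across all the hybrid variants.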

Highlights

  • We present a deep learning approach towards the large-scale prediction and analysis of bird acoustics from 100 different bird species

  • We present a comprehensive study of hybrid deep learning models on a large bird acoustics dataset, the Cornell Bird Challenge (CBC) 2020

  • While Imagenet-based models have been successfully applied to sound classification through spectrograms, they operate on individual images and do not capture temporal dependencies across time-points



Introduction

We present a deep learning approach to the large-scale prediction and analysis of bird acoustics from 100 different bird species. Before deep learning gained widespread popularity, prior work focused on feature extraction from raw audio recordings, followed by classification models such as Hidden Markov Models[9,10], Random Forests[11], and Support Vector Machines[12]. While these methods demonstrated the successful use of machine learning approaches, their major limitation has been that most of the features need to be manually identified[13] by a domain expert in order to make patterns more visible to the learning algorithms. The methodology of using Convolutional Neural Networks (CNN) to classify the spectrograms or mel-spectrograms extracted from raw audio clips subsequently became the dominant approach. These works achieved great success, and the deep learning models performed well, with high classification accuracy in detecting the presence or absence of calls from a particular species, or in classifying calls from multiple species. However, some commonly used data augmentation techniques for image classification, such as rotation and flipping, may not make intuitive sense when applied to spectrograms generated from acoustics data.
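For concreteness, a minimal sketch of this pipeline is shown below using the librosa library: converting a raw audio clip to a log-scaled mel-spectrogram, plus a spectrogram-friendly augmentation that masks random frequency bands and time spans (in the spirit of SpecAugment) instead of image-style rotation and flipping. The sample rate, FFT size, mel-band count, and mask sizes are illustrative assumptions, not the paper's reported settings.

    import librosa
    import numpy as np

    def audio_to_mel_spectrogram(path, sr=22050, n_fft=2048,
                                 hop_length=512, n_mels=128):
        """Load an audio clip and convert it to a log-scaled mel-spectrogram."""
        y, sr = librosa.load(path, sr=sr)  # resample to a fixed rate
        mel = librosa.feature.melspectrogram(
            y=y, sr=sr, n_fft=n_fft, hop_length=hop_length, n_mels=n_mels)
        # Log (dB) scaling compresses the dynamic range, which typically
        # helps CNNs pick out faint vocalizations against background noise.
        return librosa.power_to_db(mel, ref=np.max)  # shape: (n_mels, frames)

    def mask_spectrogram(mel, max_freq_bins=16, max_time_frames=24, rng=None):
        """Spectrogram-friendly augmentation: zero out a random band of mel
        bins and a random run of time frames, rather than rotating/flipping."""
        rng = rng or np.random.default_rng()
        out = mel.copy()
        f = rng.integers(0, min(max_freq_bins, mel.shape[0]) + 1)
        f0 = rng.integers(0, mel.shape[0] - f + 1)
        out[f0:f0 + f, :] = mel.min()  # frequency mask
        t = rng.integers(0, min(max_time_frames, mel.shape[1]) + 1)
        t0 = rng.integers(0, mel.shape[1] - t + 1)
        out[:, t0:t0 + t] = mel.min()  # time mask
        return out

The resulting (n_mels, frames) array can then be split into fixed-width slices along the time axis to form the per-slice inputs consumed by the hybrid CNN-RNN models.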


