Abstract

Accurately predicting an owl species from its sound can support owl conservation. Deep learning is currently the preferred approach for building accurate owl sound classifiers, owing to its excellent performance on audio data. However, deep learning models generally underperform on small datasets, which is the case for recognizing scops owl sounds. To overcome this issue, we propose a transfer learning strategy, common in computer vision tasks, that alleviates overfitting in a deep learning model for owl sound classification. Our neural network architecture uses the backbone of an EfficientNet model pre-trained on the massive ImageNet database. The model takes sound input converted into two image representations: a spectrogram and Mel Frequency Cepstral Coefficients (MFCCs). This strategy enables the use of a relatively small pre-trained image classification model, which is widely available, for transfer learning in owl sound classification. Deploying the lightweight model in an automatic sound classifier provides a fast and accurate tool for various owl conservation purposes.
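The abstract describes converting sound into image representations before feeding them to a pre-trained vision backbone. The following is a minimal sketch of the first step, computing a magnitude spectrogram with numpy; the FFT size, hop length, and windowing choices here are illustrative assumptions, not the paper's actual parameters, and the downstream EfficientNet stage is only indicated in a comment.

```python
import numpy as np

def spectrogram(signal, n_fft=256, hop=128):
    """Magnitude spectrogram via a sliding Hann-windowed FFT.

    Returns an array of shape (num_frames, n_fft // 2 + 1):
    rows are time frames, columns are frequency bins.
    """
    window = np.hanning(n_fft)
    frames = [signal[i:i + n_fft] * window
              for i in range(0, len(signal) - n_fft + 1, hop)]
    return np.abs(np.fft.rfft(np.stack(frames), axis=1))

# Synthetic 1-second stand-in for an owl call: a 440 Hz tone at 16 kHz.
sr = 16000
t = np.arange(sr) / sr
signal = np.sin(2 * np.pi * 440 * t)

spec = spectrogram(signal)
# Log-scale the magnitudes, a common step before treating the result
# as an image and passing it to a pre-trained backbone (e.g. EfficientNet).
log_spec = np.log1p(spec)
print(log_spec.shape)
```

A real pipeline would compute an MFCC image the same way (mel filterbank plus a discrete cosine transform on the log spectrogram) and stack or concatenate the two representations as model input.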
