Convolutional Neural Networks for Scops Owl Sound Classification

Alam Ahmad Hidayat, Bens Pardamean, Tjeng Wawan Cenggoro

doi:10.1016/j.procs.2020.12.010

Open Access

Journal: Procedia Computer Science	Publication Date: Jan 1, 2021
Citations: 34	License type: cc-by-nc-nd

Affiliation: Binus University

Abstract

Adopting deep learning models for bird sound classification has become common practice for building robust automated bird sound detection systems. In this paper, we employ a four-layer Convolutional Neural Network (CNN) to classify different species of Indonesian scops owls based on their vocalizations. Two widely used representations of an acoustic signal, the log-scaled mel-spectrogram and Mel Frequency Cepstral Coefficients (MFCC), are extracted from each sound file and fed into the network separately to compare model performance across inputs. A more complex CNN that processes the two extracted acoustic representations simultaneously is proposed to provide a direct comparison with the baseline model. The dual-input network is the best-performing model in our experiment, achieving 97.55% Mean Average Precision (MAP). The baseline model achieves a MAP score of 94.36% with the mel-spectrogram input and 96.08% with the MFCC input.
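The abstract reports results as Mean Average Precision but does not say how it is computed. The sketch below shows one common convention, assumed here and not taken from the paper: for each class, rank the recordings by predicted score, average the precision at every rank where a true positive appears, then take the mean over classes. The function names and the input layout (one row of class scores per recording) are hypothetical and chosen only for illustration.

```python
def average_precision(scores, labels):
    """Average precision for one class.

    scores: predicted score per recording for this class.
    labels: True where the recording belongs to this class.
    """
    # Rank recordings from highest to lowest predicted score.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits = 0
    precisions = []
    for rank, (_, is_positive) in enumerate(ranked, start=1):
        if is_positive:
            hits += 1
            # Precision at this rank: positives found so far / items seen so far.
            precisions.append(hits / rank)
    return sum(precisions) / len(precisions) if precisions else 0.0


def mean_average_precision(score_rows, true_classes, num_classes):
    """Mean of per-class average precision (assumed convention).

    score_rows: one list of per-class scores for each recording.
    true_classes: the true class index of each recording.
    Classes with no positive recording are skipped.
    """
    per_class = []
    for c in range(num_classes):
        scores = [row[c] for row in score_rows]
        labels = [t == c for t in true_classes]
        if any(labels):
            per_class.append(average_precision(scores, labels))
    return sum(per_class) / len(per_class) if per_class else 0.0
```

Under this convention, a model that ranks every recording of a species above all recordings of other species scores 1.0 for that species, so the reported scores would indicate how close each input configuration comes to a perfect ranking.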

Full Text