Abstract

The complexity of environmental sounds imposes numerous challenges on their classification. The performance of Environmental Sound Classification (ESC) depends greatly on how well the employed feature extraction technique captures generic and prototypical features of a sound. Silent and semantically irrelevant frames are ubiquitous in environmental sound recordings. To address these issues, we introduce a novel attention-based deep model that focuses on semantically relevant frames. The proposed attention-guided deep model efficiently learns the spatio-temporal relationships present in the spectrogram of a signal. The efficacy of the proposed method is evaluated on two widely used Environmental Sound Classification datasets: ESC-10 and DCASE 2019 Task-1(A). The experiments and their results demonstrate that the proposed method yields performance comparable to state-of-the-art techniques. We obtained improvements of 11.50% and 19.50% in accuracy over the baseline models of the ESC-10 and DCASE 2019 Task-1(A) datasets, respectively. To support the claim that the attention mechanism focuses on relevant regions, a visual analysis of the attention feature map is also presented. The resultant attention feature map shows that the model attends only to the semantically relevant regions of the spectrogram while skipping the irrelevant ones.
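
To illustrate the general idea of frame-level attention over spectrogram features, the following is a minimal sketch, not the authors' exact architecture: it assumes PyTorch, a hypothetical `FrameAttentionPool` module, and made-up dimensions, with features that would normally come from a CNN spectrogram encoder.

```python
# Minimal sketch of attention pooling over spectrogram time frames (assumption:
# this only illustrates the general mechanism, not the paper's exact model).
import torch
import torch.nn as nn


class FrameAttentionPool(nn.Module):
    """Weights each time frame by a learned relevance score, so that
    silent or semantically irrelevant frames contribute less."""

    def __init__(self, feat_dim: int, n_classes: int):
        super().__init__()
        self.score = nn.Linear(feat_dim, 1)          # per-frame relevance score
        self.classifier = nn.Linear(feat_dim, n_classes)

    def forward(self, feats: torch.Tensor):
        # feats: (batch, time_frames, feat_dim), e.g. from a spectrogram encoder
        attn = torch.softmax(self.score(feats), dim=1)   # (batch, T, 1) attention weights
        pooled = (attn * feats).sum(dim=1)               # attention-weighted clip summary
        return self.classifier(pooled), attn.squeeze(-1)


# Example with hypothetical sizes: 8 clips, 128 time frames, 64-dim features,
# and the 10 classes of ESC-10.
model = FrameAttentionPool(feat_dim=64, n_classes=10)
logits, weights = model(torch.randn(8, 128, 64))
print(logits.shape, weights.shape)  # torch.Size([8, 10]) torch.Size([8, 128])
```

The returned per-frame weights can be plotted over the spectrogram to inspect which regions the model attends to, analogous to the attention feature maps visualized in the paper.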
