Abstract

This paper presents a novel sound event detection (SED) system for rare events occurring in an open environment. Wavelet multiresolution analysis (MRA) is used to decompose a 30-second input audio clip into five levels. Wavelet denoising is then applied to the third and fifth levels of the MRA to filter out the background. Significant transitions, which may represent the onset of a rare event, are then estimated in these two levels by combining a peak-finding algorithm with the K-medoids clustering algorithm. Short segments of one-second duration, called 'chunks', are cropped from the input audio signal at the estimated locations of the significant transitions. Features are extracted from these chunks by a wavelet scattering network (WSN) and given as input to a support vector machine (SVM) classifier, which classifies them. The proposed SED framework produces an error rate comparable to that of SED systems based on convolutional neural network (CNN) architectures. The proposed algorithm is also computationally efficient and lightweight compared to deep learning models, as it has no learnable parameters. It requires only a single epoch of training, which is 5, 10, 200, and 600 times fewer than the models based on CNNs and deep neural networks (DNNs), a CNN with a long short-term memory (LSTM) network, a convolutional recurrent neural network (CRNN), and a CNN, respectively. The proposed model requires neither concatenation with previous frames for anomaly detection nor the creation of additional training data that the comparative deep learning models need. It needs to check almost 360 times fewer chunks for the presence of rare events than the baseline systems used for comparison in this paper. All these characteristics make the proposed system suitable for real-time applications on resource-limited devices.
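
The chunk-selection stage described above can be illustrated with a minimal sketch. It is not the authors' implementation and rests on several assumptions: the abstract does not name the wavelet, so a Haar wavelet is assumed; denoising is assumed to be soft thresholding with the universal threshold; the K-medoids step is replaced by a simpler merge of nearby peaks, each group being represented by its median member; and the WSN and SVM stages are omitted. All function names, the 1 kHz sample rate, and the `rel_height` parameter are hypothetical.

```python
import math
import random

def haar_details(signal, levels=5):
    """Haar multiresolution analysis: detail coefficients for levels 1..levels."""
    approx = list(signal)
    details = []
    for _ in range(levels):
        if len(approx) % 2:
            approx.append(approx[-1])  # pad odd lengths by repeating the last sample
        pairs = list(zip(approx[0::2], approx[1::2]))
        details.append([(a - b) / math.sqrt(2) for a, b in pairs])
        approx = [(a + b) / math.sqrt(2) for a, b in pairs]
    return details

def soft_threshold(coeffs):
    """Assumed denoiser: soft thresholding, noise level estimated from the median."""
    mags = sorted(abs(c) for c in coeffs)
    sigma = mags[len(mags) // 2] / 0.6745
    thr = sigma * math.sqrt(2.0 * math.log(len(coeffs)))
    return [math.copysign(max(abs(c) - thr, 0.0), c) for c in coeffs]

def local_peaks(values, min_height):
    """Indices of local maxima that reach at least min_height."""
    return [i for i in range(1, len(values) - 1)
            if values[i] >= min_height
            and values[i] > values[i - 1]
            and values[i] >= values[i + 1]]

def candidate_chunks(signal, sample_rate, level=3, chunk_seconds=1.0, rel_height=0.5):
    """Return (start, end) sample ranges of chunks around significant transitions."""
    chunk_len = int(chunk_seconds * sample_rate)
    envelope = [abs(c) for c in soft_threshold(haar_details(signal)[level - 1])]
    top = max(envelope)
    if top == 0.0:
        return []  # nothing survived denoising
    # A level-j coefficient with index i starts at sample i * 2**j.
    positions = [i * 2 ** level for i in local_peaks(envelope, rel_height * top)]
    # Stand-in for K-medoids: merge peaks closer together than one chunk.
    groups = []
    for pos in positions:
        if groups and pos - groups[-1][-1] < chunk_len:
            groups[-1].append(pos)
        else:
            groups.append([pos])
    chunks = []
    for group in groups:
        centre = group[len(group) // 2]  # median member of the group
        start = max(0, min(centre - chunk_len // 2, len(signal) - chunk_len))
        chunks.append((start, start + chunk_len))
    return chunks

# Synthetic 30-second clip: weak noise plus a 0.2 s, 90 Hz burst starting at 12 s.
random.seed(0)
fs = 1000  # hypothetical sample rate, kept low so the demo runs quickly
clip = [random.gauss(0.0, 0.01) for _ in range(30 * fs)]
for n in range(12 * fs, 12 * fs + 200):
    clip[n] += math.sin(2 * math.pi * 90 * n / fs)
chunks = candidate_chunks(clip, fs)
print(chunks)
```

Only the returned chunks would be passed on to feature extraction and classification, which is how checking far fewer segments than a frame-by-frame scan becomes possible. Calling `candidate_chunks` with `level=5` gives the coarser of the two levels mentioned above.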
