Abstract

We are surrounded by sounds, and the human brain identifies them easily and clearly. It processes incoming sound signals continuously and supplies us with relevant knowledge about our environment. Although they do not match the accuracy of the brain, some smart devices can extract the necessary information from an audio signal with the help of different algorithms. Over the years, various models such as Convolutional Neural Networks (CNNs), Artificial Neural Networks (ANNs), and Region-based Convolutional Neural Networks (R-CNNs), along with numerous other machine learning techniques, have been employed for sound classification. These methods have shown impressive results in distinguishing spectro-temporal patterns and different sound categories. The novelty of our research lies in showing that a long short-term memory (LSTM) network achieves better classification accuracy than a CNN for many of the features used. Additionally, we evaluate model accuracy under different techniques such as data augmentation and feature stacking. With our RNN model, we achieved an accuracy of 87%, setting a new performance benchmark on the UrbanSound8K dataset. Our findings not only advance the field of sound classification but also underscore the potential of LSTM networks and the importance of techniques such as data augmentation and feature stacking in improving the accuracy of sound recognition systems.

Key Words: Sound Classification, UrbanSound8K, Librosa, Spectrograms, Deep Learning, CNN, LSTM.

