Abstract
The aim of surveillance is to detect the occurrence of dangerous events. With the widespread use of deep learning, video surveillance has improved dramatically in recent years. For audio event detection in surveillance, deep learning methods are applied to the task of hazardous sound classification. However, because dangerous sounds occur rarely and are costly to collect, no corresponding large-scale dataset exists, and deep learning methods need a large-scale dataset to achieve good results. How to obtain a richer set of audio events has therefore become an urgent problem. Researchers have used a variety of data augmentation methods in computer vision, with clear performance improvements. These approaches are gradually being adopted in sound pattern recognition and automatic speech recognition (ASR), but there is little research on classifying hazardous sounds when only a small dataset is available. In this paper, various data augmentation methods are applied to hazardous sound classification. Our results show that data augmentation improves classification accuracy on all four classes of the dataset, by 0.5% on average. As the scale of data augmentation increases, the improvement in classification accuracy rises to about 1.5%.