Abstract

Distinguishing between a dangerous audio event like a gun firing and other non-life-threatening events, such as a plastic bag bursting, can mean the difference between life and death and, therefore, the necessary and unnecessary deployment of public safety personnel. Sounds generated by plastic bag explosions are often confused with real gunshot sounds, by either humans or computer algorithms. As a case study, the research reported in this paper offers insight into sounds of plastic bag explosions and gunshots. An experimental study in this research reveals that a deep learning-based classification model trained with a popular urban sound dataset containing gunshot sounds cannot distinguish plastic bag pop sounds from gunshot sounds. This study further shows that the same deep learning model, if trained with a dataset containing plastic pop sounds, can effectively detect the non-life-threatening sounds. For this purpose, first, a collection of plastic bag-popping sounds was recorded in different environments with varying parameters, such as plastic bag size and distance from the recording microphones. The audio clips’ duration ranged from 400 ms to 600 ms. This collection of data was then used, together with a gunshot sound dataset, to train a classification model based on a convolutional neural network (CNN) to differentiate life-threatening gunshot events from non-life-threatening plastic bag explosion events. A comparison between two feature extraction methods, the Mel-frequency cepstral coefficients (MFCC) and Mel-spectrograms, was also done. Experimental studies conducted in this research show that once the plastic bag pop sounds are injected into model training, the CNN classification model performs well in distinguishing actual gunshot sounds from plastic bag sounds.

Highlights

  • As cities become more densely populated, the volume of criminal activities follows suit [1,2]

  • We develop a simple binary classification model based on a convolutional neural network (CNN), which served as the baseline for the purpose of assessing the performance of a gunshot detection model

  • To check on the which sound classification model could be confused misclassification indicates that itextent is very of difficult for a a classification model to discern by fakesounds gunshots, trained thesounds model without letting it beThis exposed with plastic bag gunshot-like such aswe plastic bag pop from real gunshot sounds

Read more

Summary

Introduction

As cities become more densely populated, the volume of criminal activities follows suit [1,2]. It has become a sine qua non measure that false detections generated by existing gunshot detection systems are reduced to a minimum. The ability of a model to correctly discern sounds, even in the subtlest of scenarios, will differentiate a well-trained model from one that is not very efficient. In addition to a high true positive rate (TPR), a good measure for judging a detection system is its false positive rate (FPR). Why Real Sounds Matter for Machine Learning. Available online: https://www.audioanalytic.com/why-real-sounds-matter-formachine-learning/ (accessed on 29 March 2021)

Methods
Results
Conclusion
Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call