This paper analyses the efficiency of various frequency cepstral coefficients (FCC) in a non-speech application, specifically in classifying acoustic impulse events-gunshots. There are various methods for such event identification available. The majority of these methods are based on time or frequency domain algorithms. However, both of these domains have their limitations and disadvantages. In this article, an FCC, combining the advantages of both frequency and time domains, is presented and analyzed. These originally speech features showed potential not only in speech-related applications but also in other acoustic applications. The comparison of the classification efficiency based on features obtained using four different FCC, namely mel-FCC (MFCC), inverse mel-frequency cepstral coefficients (IMFCC), linear-frequency cepstral coefficients (LFCC), and gammatone-frequency cepstral coefficients (GTCC) is presented. An optimal frame length for an FCC calculation is also explored. Various gunshots from short guns and rifle guns of different calibers and multiple acoustic impulse events, similar to the gunshots, to represent false alarms are used. More than 600 acoustic events records have been acquired and used for training and validation of two designed classifiers, support vector machine, and neural network. Accuracy, recall and Matthew’s correlation coefficient measure the classification success rate. The results reveal the superiority of GFCC to other analyzed methods.
Read full abstract