On investigating efficient methodology for Environmental Sound Recognition

Cruz Alfredo Ruiz-Martinez, Enrique Escamilla-Hernandez, Muhammad Tahir Akhtar, Yoshikazu Washizawa

doi:10.1109/ispacs.2013.6704548

Abstract

This paper presents a comparative study of methods for identifying environmental sounds. We evaluate two feature-extraction methods: Mel Frequency Cepstral Coefficients (MFCC), which are well known from speaker identification, and Matching Pursuit (MP) with a Gabor dictionary, which gives a time-frequency representation that has been employed for scene recognition. In the classification stage, we compare Support Vector Machines (SVM), Logistic Regression, and a Backpropagation Artificial Neural Network (BP-ANN). Simulation results show that MFCC gives higher recognition performance than MP. Furthermore, concatenating MFCC features with some MP features, e.g., scale, may also improve performance in some situations. We observe that SVM shows the best performance among the classifiers, for clean as well as noisy signals.
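
As a rough illustration of the MFCC feature-extraction step named in the abstract, the following is a minimal pure-Python sketch for a single audio frame. It is not the authors' implementation: the Hamming window, the filter count (20), the coefficient count (13), the common 2595/700 mel formula, and all function names are assumptions chosen only for illustration.

```python
import cmath
import math


def hz_to_mel(f):
    # Common mel-scale formula (an assumption; the paper's variant is not stated here).
    return 2595.0 * math.log10(1.0 + f / 700.0)


def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)


def power_spectrum(frame):
    """Power spectrum (bins 0..N/2) of a Hamming-windowed frame via a naive DFT."""
    n = len(frame)
    windowed = [x * (0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)))
                for i, x in enumerate(frame)]
    spec = []
    for k in range(n // 2 + 1):
        s = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                for i, x in enumerate(windowed))
        spec.append(abs(s) ** 2 / n)
    return spec


def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters whose centres are evenly spaced on the mel scale."""
    top = hz_to_mel(sample_rate / 2.0)
    mel_points = [top * i / (n_filters + 1) for i in range(n_filters + 2)]
    bins = [int(math.floor((n_fft + 1) * mel_to_hz(m) / sample_rate))
            for m in mel_points]
    bank = []
    for j in range(1, n_filters + 1):
        lo, mid, hi = bins[j - 1], bins[j], bins[j + 1]
        filt = [0.0] * (n_fft // 2 + 1)
        for k in range(lo, mid):      # rising edge
            filt[k] = (k - lo) / (mid - lo)
        for k in range(mid, hi):      # falling edge, peak of 1.0 at k == mid
            filt[k] = (hi - k) / (hi - mid)
        bank.append(filt)
    return bank


def dct2(values, n_coeffs):
    """Unnormalised DCT-II, keeping the first n_coeffs coefficients."""
    n = len(values)
    return [sum(v * math.cos(math.pi * c * (i + 0.5) / n)
                for i, v in enumerate(values))
            for c in range(n_coeffs)]


def mfcc(frame, sample_rate, n_filters=20, n_coeffs=13):
    """MFCC vector for one frame: power spectrum -> mel energies -> log -> DCT."""
    spec = power_spectrum(frame)
    bank = mel_filterbank(n_filters, len(frame), sample_rate)
    log_energies = [
        math.log(max(sum(w * p for w, p in zip(filt, spec)), 1e-12))
        for filt in bank
    ]
    return dct2(log_energies, n_coeffs)


# Usage: a 256-sample frame of a 1 kHz tone sampled at 8 kHz (synthetic test data).
sample_rate = 8000
frame = [math.sin(2 * math.pi * 1000 * t / sample_rate) for t in range(256)]
coeffs = mfcc(frame, sample_rate)
```

In a pipeline like the one the abstract describes, such per-frame vectors (optionally concatenated with MP-derived features such as scale) would be passed to a classifier such as an SVM; that stage is not sketched here.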

Full Text