Abstract

Support vector machines (SVMs) have seen increasing use in acoustic event classification since their rise to popularity about two decades ago. In recent years, however, deep learning methods such as deep neural networks (DNNs) have been shown to outperform a number of classification methods in various pattern recognition problems. This work starts by comparing the classification performance of DNNs against SVMs with a number of feature representations that fall into two categories: cepstral features and time-frequency image features. Unlike related work, it also compares the two classifiers with feature vector combination, and it compares the training and evaluation times of the classifiers and features. Performance is evaluated on an audio surveillance database containing 10 sound classes, each with multiple subclasses, with noise added at various signal-to-noise ratios (SNRs). The experimental results show that DNNs have better overall classification performance than SVMs with both individual and combined features, and that their advantage in classification accuracy is particularly pronounced at low SNRs. The DNN classifier also had the fastest evaluation time, but its training time was slow.
