Abstract
Support vector machines (SVMs) have seen increased use in acoustic event classification since their rise to popularity about two decades ago. In recent years, however, deep learning methods such as deep neural networks (DNNs) have been shown to outperform a number of classification methods in various pattern recognition problems. This work starts by comparing the classification performance of DNNs against SVMs using a number of feature representations that fall into two categories: cepstral features and time-frequency image features. Unlike related work, it also compares the two classifiers with feature vector combination, and compares the training and evaluation times of the classifiers and features. Performance is evaluated on an audio surveillance database containing 10 sound classes, each with multiple subclasses, with noise added at various signal-to-noise ratios (SNRs). The experimental results show that DNNs have better overall classification performance than SVMs with both individual and combined features, and that the advantage in classification accuracy is most pronounced at low SNRs. The DNN classifier was also found to have the fastest evaluation time, although its training time was slow.