Abstract

In this paper, we present an improved implementation of an auditory-inspired FFT-based model that calculates a noise-robust FFT spectrum. By using the characteristic frequency (CF) values of the cochlear filters in an early auditory (EA) model for power spectrum selection, and a pair of running averages to implement self-normalization, the proposed FFT model allows more flexibility in the extraction of audio features. To evaluate the performance of the proposed FFT model, a speech/music/noise classification task is carried out in which a decision tree learning algorithm (C4.5) is used as the classifier. The audio features used for classification include mel-frequency cepstral coefficient (MFCC) features, a set of conventional spectral features, and spectral features calculated using the proposed FFT model. Compared to the conventional MFCC and spectral features, the spectral features based on the proposed FFT model show more robust performance in noisy test cases.
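The abstract mentions self-normalization implemented with a pair of running averages. The paper's exact formulation is not given here, so the following is only a minimal sketch of one common interpretation: normalizing each power-spectrum frame by the ratio of a fast to a slow exponential running average, which suppresses slowly varying (noise-like) energy. The smoothing factors `alpha_fast` and `alpha_slow` are hypothetical parameter values, not taken from the paper.

```python
import numpy as np

def self_normalize(power_frames, alpha_fast=0.9, alpha_slow=0.999):
    """Sketch of self-normalization with a pair of running averages.

    power_frames: array of shape (num_frames, num_bins), nonnegative
    power-spectrum values. Each frame is normalized by the ratio of a
    fast to a slow exponential running average per frequency bin.
    """
    num_bins = power_frames.shape[1]
    fast = np.zeros(num_bins)   # short-term (fast) running average
    slow = np.zeros(num_bins)   # long-term (slow) running average
    out = np.empty_like(power_frames, dtype=float)
    eps = 1e-10                 # avoid division by zero
    for t, frame in enumerate(power_frames):
        fast = alpha_fast * fast + (1.0 - alpha_fast) * frame
        slow = alpha_slow * slow + (1.0 - alpha_slow) * frame
        out[t] = fast / (slow + eps)
    return out
```

Under this reading, stationary background noise drives both averages toward the same value, so its normalized level flattens out, while transient speech or music energy momentarily raises the fast average relative to the slow one.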
