Abstract

Mel-frequency cepstral coefficients (MFCCs) are acoustic features based on human auditory perception. They focus on spectral properties and capture characteristics of the audio signal related to vocal tract shape and to the energy in different frequency bands. The feature extraction methodology involves several stages. First, the audio signals go through a preprocessing stage in which they are normalized, background noise is reduced, and a median filter is applied. In the MFCC feature extraction block, the audio signal is first split into short frames. A Hamming window is then applied to smooth the edges of each frame, and the short-time Fourier transform of each frame is calculated. Next, Mel filters are applied, which adjust the representation of the frequency spectrum to human auditory perception. Finally, cepstral coefficients are calculated from the resulting Mel-scaled spectrum. The MFCCs are then used as input features for classifiers and machine learning algorithms.
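
The stages described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, the 16 kHz sampling rate, the 25 ms frame and 10 ms hop, the 26 Mel filters, the 13 coefficients, and the 3-sample median kernel are all assumptions chosen for illustration. The noise-reduction step is omitted because the abstract does not say which method is used.

```python
import numpy as np

def hz_to_mel(f):
    # Common O'Shaughnessy-style Mel mapping (an assumed choice of formula).
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def preprocess(signal, kernel=3):
    """Peak-normalize, then apply a sliding median filter (odd kernel assumed).
    Noise reduction is left out: the source does not specify the method."""
    x = np.asarray(signal, dtype=float)
    peak = np.max(np.abs(x))
    if peak > 0:
        x = x / peak
    pad = kernel // 2
    padded = np.pad(x, pad, mode="edge")
    shifted = np.stack([padded[i:i + len(x)] for i in range(kernel)], axis=0)
    return np.median(shifted, axis=0)

def frame_signal(x, frame_len, hop):
    """Split a 1-D signal into overlapping frames (requires len(x) >= frame_len)."""
    n_frames = 1 + (len(x) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return x[idx]

def mel_filterbank(n_filters, n_fft, sr):
    """Triangular filters spaced evenly on the Mel scale."""
    mel_pts = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_pts) / sr).astype(int)
    fb = np.zeros((n_filters, n_fft // 2 + 1))
    for m in range(1, n_filters + 1):
        left, center, right = bins[m - 1], bins[m], bins[m + 1]
        for k in range(left, center):      # rising edge
            fb[m - 1, k] = (k - left) / (center - left)
        for k in range(center, right):     # falling edge
            fb[m - 1, k] = (right - k) / (right - center)
    return fb

def dct2(x, n_coeffs):
    """Unnormalized DCT-II along the last axis, keeping the first n_coeffs."""
    n = x.shape[-1]
    k = np.arange(n_coeffs)[:, None]
    basis = np.cos(np.pi * k * (2 * np.arange(n)[None, :] + 1) / (2 * n))
    return x @ basis.T

def mfcc(signal, sr=16000, frame_len=400, hop=160,
         n_fft=512, n_filters=26, n_coeffs=13):
    x = preprocess(signal)
    # Hamming window tapers the edges of each frame.
    frames = frame_signal(x, frame_len, hop) * np.hamming(frame_len)
    # Per-frame power spectrum (frames are zero-padded to n_fft).
    power = np.abs(np.fft.rfft(frames, n=n_fft, axis=1)) ** 2 / n_fft
    mel_energy = power @ mel_filterbank(n_filters, n_fft, sr).T
    log_mel = np.log(mel_energy + 1e-10)   # small offset avoids log(0)
    return dct2(log_mel, n_coeffs)

# Usage: one second of a 440 Hz tone sampled at 16 kHz.
t = np.arange(16000) / 16000.0
features = mfcc(np.sin(2 * np.pi * 440.0 * t))
print(features.shape)  # (98, 13)
```

Each row of `features` is the coefficient vector for one frame. In a classification setting, these rows (or statistics computed over them) would serve as the input features mentioned above.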
