Abstract

An optimal representation of acoustic features is an ongoing challenge in automatic speech emotion recognition research. In this study, we proposed Cepstral coefficients based on evolutionary filterbanks as emotional features. It is difficult to guarantee that an individual optimized filterbank provides the best representation for emotion classification. Consequently, we employed six HMM-based binary classifiers that used a specific filterbank, which was optimized by a genetic algorithm to categorize the data into seven emotion classes. These optimized classifiers were applied in a hierarchical manner and outperformed conventional Mel Frequency Cepstral Coefficients in terms of overall emotion classification accuracy. The proposed method using evolutionary-based Cepstral coefficients achieved a weighted average recall of 87.29% on the Berlin database while the same approach but using conventional Cepstral features achieved only 79.63%.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call