Indonesian Speech Emotion Recognition using Cross-Corpus Method with the Combination of MFCC and Teager Energy Features

Oscar Utomo Kumala, Amalia Zahra

doi:10.14569/ijacsa.2021.0120422

Abstract

Emotion recognition is one of the most widely studied topics in speech technology. Emotions conveyed in speech can provide useful information for many purposes. The main aspects of speech emotion recognition are the speech features, the speech corpus, and the machine learning algorithm used as the classifier. In this paper, a cross-corpus method is used to conduct Indonesian Speech Emotion Recognition (SER) with a combination of Mel Frequency Cepstral Coefficients (MFCC) and Teager Energy features. Using a Support Vector Machine (SVM) as the classifier, the experimental results show that applying the cross-corpus method by adding corpora from other languages to the training dataset improves emotion classification accuracy by 4.16% with the MFCC Statistics feature and by 2.09% with the Teager-MFCC Statistics feature.
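The cross-corpus idea described in the abstract, adding other-language corpora to the training data, can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' pipeline: the function name `build_cross_corpus_split`, the sample tuples, the utterance IDs, and the emotion labels are all hypothetical placeholders, and it assumes the test set stays Indonesian-only.

```python
from collections import Counter


def build_cross_corpus_split(target_train, target_test, foreign_corpora):
    """Add every foreign-language corpus to the training set only.

    The test set is left untouched, so evaluation is still performed
    on the target language (assumed here to be Indonesian).
    """
    train = list(target_train)
    for corpus in foreign_corpora:
        train.extend(corpus)
    return train, list(target_test)


# Hypothetical toy samples: (utterance_id, language, emotion_label).
indonesian_train = [("id_001", "id", "happy"), ("id_002", "id", "sad")]
indonesian_test = [("id_101", "id", "angry")]
german_corpus = [("de_001", "de", "sad")]
english_corpus_a = [("en_a_001", "en", "happy")]
english_corpus_b = [("en_b_001", "en", "angry")]

train, test = build_cross_corpus_split(
    indonesian_train,
    indonesian_test,
    [german_corpus, english_corpus_a, english_corpus_b],
)

# Count how many training samples come from each language.
languages_in_train = Counter(lang for _, lang, _ in train)
```

In a real experiment each tuple would carry a feature vector instead of an ID, and the resulting training set would be passed to the classifier (an SVM in this paper).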

Highlights

Nowadays, the Information Technology (IT) sector is growing rapidly, especially in the area of mobile devices.
Testing with the Mel Frequency Cepstral Coefficients (MFCC) Statistics feature achieved accuracies of 83.33% and 79.17% for the first and second scenarios, respectively, whereas the Teager-MFCC Statistics feature achieved accuracies of 85.42% and 83.33% for the same scenarios.
Applying the cross-corpus method by adding corpora from other languages to the training dataset can improve the overall performance of emotion recognition, including Indonesian Speech Emotion Recognition (SER).
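The improvements quoted in the abstract (4.16% and 2.09%) equal the differences between the two scenarios' accuracies reported above, which suggests they are absolute percentage-point gains rather than relative ones. A quick check in Python, where the dictionary layout is just an illustrative choice:

```python
# Accuracies (%) as reported in the highlights: (first scenario, second scenario).
reported = {
    "MFCC Statistics": (83.33, 79.17),
    "Teager-MFCC Statistics": (85.42, 83.33),
}

# Absolute difference in percentage points, rounded to two decimals.
gains = {
    name: round(first - second, 2)
    for name, (first, second) in reported.items()
}
# gains == {"MFCC Statistics": 4.16, "Teager-MFCC Statistics": 2.09}
```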

Summary

INTRODUCTION

Nowadays, the Information Technology (IT) sector is growing rapidly, especially in the area of mobile devices. One simple application is a virtual assistant that compiles a comforting song playlist for the user when sadness is recognized in the user's speech. Because of this high potential for use, it is necessary to analyze the emotion recognition process itself further. The first main topic of this study is the use of the cross-corpus method [7] for Indonesian SER. There are three corpora: one German corpus and two English corpora. The other main topic is the combination of two speech features: Mel Frequency Cepstral Coefficients (MFCC) features and Teager Energy features. The MFCC features are combined with Teager Energy features [9] in the hope of achieving better results. These speech features are extracted from the corpus and used along with their statistical values.
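As a rough illustration of the Teager Energy side of the feature set, the standard discrete Teager energy operator is ψ[x(n)] = x(n)² − x(n−1)·x(n+1). The sketch below applies it to a synthetic tone and then collapses the result into summary statistics. It makes several assumptions: the choice of statistics (mean, standard deviation, minimum, maximum) is a guess, since this summary does not list which statistical values the authors use; MFCC extraction and the exact way the two features are combined are omitted; and the helper names are hypothetical.

```python
import math
import statistics


def teager_energy(signal):
    """Discrete Teager energy: psi[n] = x[n]^2 - x[n-1] * x[n+1].

    Only interior samples are returned, so the output has
    len(signal) - 2 values.
    """
    return [
        signal[n] ** 2 - signal[n - 1] * signal[n + 1]
        for n in range(1, len(signal) - 1)
    ]


def summarize(values):
    """Collapse a variable-length feature track into fixed-length statistics.

    The set of statistics here is an assumption made for illustration.
    """
    return {
        "mean": statistics.fmean(values),
        "std": statistics.pstdev(values),
        "min": min(values),
        "max": max(values),
    }


# Synthetic test signal: a pure tone A * sin(omega * n).
amplitude, omega = 2.0, 0.3
tone = [amplitude * math.sin(omega * n) for n in range(200)]

energy = teager_energy(tone)
stats = summarize(energy)

# For a pure tone the operator is constant: A^2 * sin(omega)^2.
expected = amplitude ** 2 * math.sin(omega) ** 2
```

For a pure tone the operator output is constant, so `stats["mean"]` matches `expected` and `stats["std"]` is zero up to floating-point error. Real speech gives a time-varying energy track, which is why fixed-length statistics are useful as classifier input.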

RELATED WORKS

EXPERIMENTS

Preparing Corpus

Extracting the Speech Features

Configuring Corpus for Training and Testing

Conducting Training and Testing

Testing Result for the Same Corpus

Testing Result for Different Corpus

Analysis

CONCLUSION

Journal: International Journal of Advanced Computer Science and Applications
Publication Date: Jan 1, 2021
Citations: 2
License type: cc-by
