Abstract

Emotions are an important aspect of human communication, and their expression can be recognized from the voice. Voice detection and speech recognition technologies have developed rapidly to improve human-machine interaction. This study aims to classify emotions by detecting them in human speech. One of the most frequently used feature-extraction methods for this task is the Mel-Frequency Cepstral Coefficient (MFCC), in which the sound wave is converted into a compact representation: MFCCs are coefficients that collectively represent the short-term power spectrum of a sound, derived from a linear cosine transform of the log power spectrum on a nonlinear mel scale of frequency. The primary data used in this research were recorded by the author; the secondary data consist of 500 voice recordings from the "Berlin Database of Emotional Speech". MFCC extraction captures information implicit in the human voice, in particular cues to the emotion the speaker is experiencing while producing the sound. In this study, the highest accuracy, 85%, was obtained when training for 10,000 epochs.
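As a rough illustration of the extraction step described above (not the authors' exact pipeline), the following minimal Python sketch computes MFCCs from a recording using the librosa library. The file name "recording.wav" and the choice of 13 coefficients are assumptions made for demonstration only.

```python
# Minimal MFCC extraction sketch; illustrative only, not the paper's implementation.
import librosa

# Load an audio file (librosa resamples to 22.05 kHz by default).
# "recording.wav" is a hypothetical file name.
y, sr = librosa.load("recording.wav")

# Compute MFCCs: a log mel-spectrogram followed by a discrete cosine transform.
# n_mfcc=13 is a common choice, assumed here; the paper does not specify it.
mfccs = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13)

print(mfccs.shape)  # (n_mfcc, number of frames)
```

The resulting matrix of per-frame coefficients is the kind of representation typically fed to a classifier for emotion recognition.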
