Abstract

Emotions are an important aspect of human communication, and their expression can be recognized from the voice. Voice detection and speech recognition technologies have developed rapidly to improve human-machine interaction. This study aims to classify emotions by detecting them in human speech. One of the most frequently used feature-extraction methods for this task is the Mel-Frequency Cepstral Coefficient (MFCC), in which the sound wave is converted into a compact representation: MFCCs are coefficients that collectively represent the short-term power spectrum of a sound, derived from a linear cosine transform of the log power spectrum on a nonlinear mel scale of frequency. The primary data used in this research were recorded by the author; the secondary data consist of 500 voice recordings from the "Berlin Database of Emotional Speech". MFCC extraction captures information implicit in the human voice, in particular cues to the emotion the speaker is experiencing while producing the sound. In this study, the highest accuracy, 85%, was obtained when training for 10,000 epochs.
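As a rough illustration of the extraction step described above (not the authors' exact pipeline), the following minimal Python sketch computes MFCCs from a recording using the librosa library. The file name "recording.wav" and the choice of 13 coefficients are assumptions made for demonstration only.

```python
# Minimal MFCC extraction sketch; illustrative only, not the paper's implementation.
import librosa

# Load an audio file (librosa resamples to 22.05 kHz by default).
# "recording.wav" is a hypothetical file name.
y, sr = librosa.load("recording.wav")

# Compute MFCCs: a log mel-spectrogram followed by a discrete cosine transform.
# n_mfcc=13 is a common choice, assumed here; the paper does not specify it.
mfccs = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13)

print(mfccs.shape)  # (n_mfcc, number of frames)
```

The resulting matrix of per-frame coefficients is the kind of representation typically fed to a classifier for emotion recognition.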
