Abstract

Detecting Emotion from Spoken Words (SER) is the task of detecting the underlying emotion in spoken language. It is a challenging task, as emotions are subjective and highly contextual. Machine learning algorithms have been widely used for SER, and one such algorithm is the Gaussian Mixture Model (GMM) algorithm. The GMM algorithm is a statistical model that represents the probability distribution of a random variable as a sum of Gaussian distributions. It has been widely used for speech recognition and classification tasks. In this article, we offer a method for SER using Extreme Machine Learning (EML) with the GMM algorithm. EML is a type of machine learning that uses randomization to achieve high accuracy at a low computational cost. It has been effectively utilised in various classification tasks. For the planned approach includes two steps: feature extraction and emotion classification. Cepstral Coefficients of Melody Frequency (MFCCs) are used in order to extract features. MFCCs are commonly used for speech processing and represent the spectral envelope of the speech signal. The GMM algorithm is used for emotion classification. The input features are modelled as a mixture of Gaussians, and the emotion is classified based on the likelihood of the input features belonging to each Gaussian. Measurements were taken of the suggested method on the The Berlin Database of Emotional Speech (EMO-DB) and achieved an accuracy of 74.33%. In conclusion, the proposed approach to SER using EML and the GMM algorithm shows promising results. It is a computationally efficient and effective approach to SER and can be used in various applications, such as speech-based emotion detection for virtual assistants, call centre analytics, and emotional analysis in psychotherapy.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call