Emotion is a spontaneous mental state that does not result from deliberate effort, and speech can convey many different emotions. Automatic emotion recognition from human speech is becoming increasingly common because it enhances interaction between people and technology. Several temporal and spectral features can be extracted from speech, including pitch-related features, Mel Frequency Cepstral Coefficients (MFCCs), and speech formants, and a range of methods can be used to classify them. This study examines statistical features, including MFCCs, and uses linear discriminant analysis (LDA) to classify them. It also describes a database of acted emotional Marathi speech. The data samples were collected from male and female Marathi speakers who simulated the target emotions, producing Marathi utterances that could occur in everyday conversation and are represented across all analysed emotions. Three emotion categories were used to label the samples: happy, sad, and angry. For MFCC and LPC features, the training and testing accuracies are 98.82% and 85.82%, respectively.
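The pipeline the abstract describes, MFCC feature extraction followed by LDA classification, can be sketched as follows. This is a minimal NumPy illustration under simplifying assumptions (single-frame analysis, a hand-rolled mel filterbank, and a two-class Fisher LDA rather than the paper's three-class setup); it is not the authors' implementation.

```python
import numpy as np

def mfcc(frame, sr=16000, n_fft=512, n_mels=26, n_ceps=13):
    """Single-frame MFCC sketch: power spectrum -> mel filterbank -> log -> DCT.
    Parameter values here are common defaults, not taken from the paper."""
    spec = np.abs(np.fft.rfft(frame, n_fft)) ** 2

    # Triangular mel filterbank between 0 Hz and the Nyquist frequency.
    hz_to_mel = lambda f: 2595 * np.log10(1 + f / 700)
    mel_to_hz = lambda m: 700 * (10 ** (m / 2595) - 1)
    hz_pts = mel_to_hz(np.linspace(hz_to_mel(0), hz_to_mel(sr / 2), n_mels + 2))
    bins = np.floor((n_fft + 1) * hz_pts / sr).astype(int)
    fbank = np.zeros((n_mels, n_fft // 2 + 1))
    for i in range(1, n_mels + 1):
        l, c, r = bins[i - 1], bins[i], bins[i + 1]
        for k in range(l, c):
            fbank[i - 1, k] = (k - l) / max(c - l, 1)
        for k in range(c, r):
            fbank[i - 1, k] = (r - k) / max(r - c, 1)

    energies = np.log(fbank @ spec + 1e-10)
    # DCT-II decorrelates the log filterbank energies into cepstral coefficients.
    n = np.arange(n_mels)
    dct = np.cos(np.pi * np.outer(np.arange(n_ceps), (2 * n + 1) / (2 * n_mels)))
    return dct @ energies

def lda_fit(X, y):
    """Two-class Fisher LDA: w = Sw^-1 (m1 - m0), midpoint threshold."""
    m0, m1 = X[y == 0].mean(0), X[y == 1].mean(0)
    Sw = np.cov(X[y == 0].T) + np.cov(X[y == 1].T)  # within-class scatter
    w = np.linalg.solve(Sw + 1e-6 * np.eye(X.shape[1]), m1 - m0)
    return w, w @ (m0 + m1) / 2  # project onto w; classify against threshold
```

In use, each utterance would be framed, MFCC vectors extracted per frame and summarized statistically, and the resulting feature vectors projected with LDA; samples whose projection `X @ w` exceeds the threshold are assigned to class 1.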