Abstract

With the proliferation of smart home devices like Amazon Alexa and Google Home, automatic emotion detection from user commands and interactions with smart assistants (smart commands) can enable personalized services for the inhabitants of a smart-home environment. In this paper, we compare different machine learning algorithms to identify a suitable classification technique for emotion detection. We also propose four new audio features, named Chunk Gap Length, Mean Chunk Duration, Mean Word Duration Per Chunk, and Per Chunk Word Count, to complement the existing Mel Frequency Cepstral Coefficient (MFCC) and Mel Spectrogram (MEL) features for emotion classification. We used the publicly available RAVDESS dataset for our initial experiments and then generated a custom dataset of 5000 smart-home voice commands covering five emotional states: happy, normal, sad, fearful, and angry. Evaluation results show that combining our proposed features with MFCC and MEL classifies the correct emotions for individual users more accurately than MFCC and MEL alone.
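
To make the feature pipeline concrete, the sketch below (Python, using librosa) shows one plausible way to extract MFCC and Mel-spectrogram summaries together with silence-based chunk statistics in the spirit of the proposed Chunk Gap Length and Mean Chunk Duration features. The function name, the energy-based chunking heuristic, and all parameter values are illustrative assumptions rather than the paper's exact method; the word-level features (Mean Word Duration Per Chunk, Per Chunk Word Count) would additionally require word timestamps from a speech recognizer.

    import numpy as np
    import librosa

    def extract_features(path, sr=16000, n_mfcc=40, n_mels=128, top_db=30):
        """Hypothetical feature extractor: MFCC and MEL summaries plus
        chunk statistics. Parameter values are assumptions, not the paper's."""
        y, sr = librosa.load(path, sr=sr)

        # Per-coefficient means over time for the two spectral feature sets.
        mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc).mean(axis=1)
        mel = librosa.power_to_db(
            librosa.feature.melspectrogram(y=y, sr=sr, n_mels=n_mels)
        ).mean(axis=1)

        # Approximate speech "chunks" as non-silent intervals; this
        # energy-based split is an assumed definition of a chunk.
        intervals = librosa.effects.split(y, top_db=top_db)
        durations = (intervals[:, 1] - intervals[:, 0]) / sr
        if len(intervals) > 1:
            gaps = (intervals[1:, 0] - intervals[:-1, 1]) / sr
        else:
            gaps = np.array([0.0])

        chunk_feats = np.array([
            gaps.mean(),       # in the spirit of Chunk Gap Length
            durations.mean(),  # in the spirit of Mean Chunk Duration
        ])
        return np.concatenate([mfcc, mel, chunk_feats])

A feature vector built this way could then be passed to any of the compared classifiers (for example, a support vector machine or random forest).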
