In recent years, the world has entered a multimedia era, with applications in business, recommendation systems, and information retrieval. Multimedia data is rich in content that expresses a range of human emotions. Emotion detection from images and videos has received considerable attention in this domain, but far less effort has been devoted to text data. Deep learning has outperformed traditional techniques in sentiment analysis tasks. Inspired by that work, a deep learning-based framework has been implemented on multimedia text data for the task of fine-grained emotion detection. This work introduces a new corpus, collected from a TV show’s transcript, that expresses different forms of emotion. The corpus was manually annotated with the help of expert English-language annotators. For emotion detection, this paper proposes a sequence-based convolutional neural network (CNN) with word embeddings. An attention mechanism in the proposed model allows the CNN to focus on the words that most affect the classification, or on the parts of the features that deserve more attention. The main aim of the work is to develop a framework that generalizes to newly collected data and helps businesses understand their customers and monitor social media, giving an overview of wider public opinion on particular topics. Experiments conducted on the dataset show that the proposed framework detects emotions from text with good precision and accuracy.