Abstract

Real-time emotion recognition in conversations (RTERC), the task of using historical context to identify the emotion of a query utterance in a conversation, is important for opinion mining and for building empathetic machines. Existing works mainly focus on obtaining each utterance's representation separately and then utilizing utterance-level features to model the emotion representation of the query. These approaches treat each utterance as a unit and capture utterance-level dependencies in the context, but ignore word-level dependencies among different utterances. In this paper, we propose a multi-view network (MVN) that explores the emotion representation of a query from two different views, i.e., the word-level and utterance-level views. In the word-level view, MVN treats the context and query as word sequences and models the word-level dependencies among utterances. In the utterance-level view, MVN extracts each utterance's representation separately and then models the utterance-level dependencies in the context. Experimental results on two public emotion conversation datasets show that the proposed model outperforms state-of-the-art baselines.
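
To make the two complementary views concrete, the sketch below shows one plausible way such an architecture could be organized. It is a minimal, hypothetical PyTorch illustration, not the paper's actual MVN: the class name MultiViewSketch, the GRU encoders, and all dimensions are assumptions. The word-level branch reads the context and query as one flat word sequence (so recurrence can cross utterance boundaries), while the utterance-level branch encodes each utterance separately and then models dependencies over the resulting utterance representations.

```python
import torch
import torch.nn as nn

class MultiViewSketch(nn.Module):
    """Illustrative two-view encoder; encoder choices and sizes are assumptions,
    not the paper's exact MVN architecture."""

    def __init__(self, vocab_size, embed_dim=100, hidden_dim=128, num_emotions=6):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_dim, padding_idx=0)
        # Word-level view: one BiGRU over the concatenated word sequence of
        # context + query, capturing word-level dependencies across utterances.
        self.word_gru = nn.GRU(embed_dim, hidden_dim,
                               batch_first=True, bidirectional=True)
        # Utterance-level view: encode each utterance separately, then model
        # utterance-level dependencies in the context with a second BiGRU.
        self.utt_encoder = nn.GRU(embed_dim, hidden_dim,
                                  batch_first=True, bidirectional=True)
        self.context_gru = nn.GRU(2 * hidden_dim, hidden_dim,
                                  batch_first=True, bidirectional=True)
        self.classifier = nn.Linear(4 * hidden_dim, num_emotions)

    def forward(self, flat_words, utterances):
        # flat_words: (batch, total_words) -- context + query as one word sequence
        # utterances: (batch, num_utts, words_per_utt) -- utterance-segmented view
        _, h_word = self.word_gru(self.embedding(flat_words))
        word_view = torch.cat([h_word[-2], h_word[-1]], dim=-1)  # (batch, 2*hidden)

        b, n, w = utterances.shape
        _, h_utt = self.utt_encoder(self.embedding(utterances.view(b * n, w)))
        utt_reps = torch.cat([h_utt[-2], h_utt[-1]], dim=-1).view(b, n, -1)
        _, h_ctx = self.context_gru(utt_reps)
        utt_view = torch.cat([h_ctx[-2], h_ctx[-1]], dim=-1)     # (batch, 2*hidden)

        # Fuse the two views and predict the query utterance's emotion.
        return self.classifier(torch.cat([word_view, utt_view], dim=-1))

# Example usage with dummy inputs (vocabulary size and lengths are arbitrary):
model = MultiViewSketch(vocab_size=5000)
flat = torch.randint(1, 5000, (2, 60))      # batch of 2 dialogues, 60 words each
utts = torch.randint(1, 5000, (2, 5, 12))   # 5 utterances of 12 words each
logits = model(flat, utts)                  # (2, num_emotions)
```

The key design point the sketch mirrors is that the two branches consume different segmentations of the same conversation, so their fused representation carries both cross-utterance word dependencies and utterance-level context.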
