Relation-aware Graph Attention Networks with Relational Position Encodings for Emotion Recognition in Conversations

Taichi Ishiwatari, Yuki Yasuda, Taro Miyazaki, Jun Goto

doi:10.18653/v1/2020.emnlp-main.597

Abstract

Interest in emotion recognition in conversations (ERC) has been increasing in various fields, because it can be used to analyze user behaviors and detect fake news. Many recent ERC methods use graph-based neural networks to take the relationships between the utterances of the speakers into account. In particular, the state-of-the-art method considers self- and inter-speaker dependencies in conversations by using relational graph attention networks (RGAT). However, graph-based neural networks do not take sequential information into account. In this paper, we propose relational position encodings that provide RGAT with sequential information reflecting the relational graph structure. Accordingly, our RGAT model can capture both the speaker dependency and the sequential information. Experiments on four ERC datasets show that our model is beneficial to recognizing emotions expressed in conversations. In addition, our approach empirically outperforms the state-of-the-art on all of the benchmark datasets.

Highlights

Interest in emotion recognition in conversations (ERC) has been increasing in various fields (Picard, 2010), because it can be used to analyze user behaviors (Lee and Hong, 2016) and detect fake news (Guo et al., 2019).
Experiments on four ERC benchmark datasets showed that our relational position encoding outperformed baselines and state-of-the-art methods.
We propose relational position encodings for the relational graph structure to reflect both the sequential information contained in utterances and the speaker dependency in conversations.
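
The idea of pairing speaker relations with sequential position can be illustrated with a small sketch. This is not the authors' formulation: the function name `build_relational_edges`, the window parameters, the four relation labels, and the choice to count an utterance's link to itself as "past" are all assumptions made for illustration. Each edge records which utterance attends to which, whether the two share a speaker, and the signed offset between them, which is the kind of information a position encoding over a relational graph would need.

```python
from typing import List, Tuple

Edge = Tuple[int, int, str, int]  # (target index, source index, relation, offset)


def build_relational_edges(
    speakers: List[str], past_window: int, future_window: int
) -> List[Edge]:
    """Hypothetical helper: list edges between utterances inside a context window.

    The relation label combines speaker dependency ("self" or "inter") with
    temporal direction ("past" or "future"). The offset j - i is the relative
    position of the source utterance with respect to the target utterance.
    """
    edges: List[Edge] = []
    n = len(speakers)
    for i in range(n):
        first = max(0, i - past_window)
        last = min(n - 1, i + future_window)
        for j in range(first, last + 1):
            dependency = "self" if speakers[i] == speakers[j] else "inter"
            # Assumption: an utterance's edge to itself is labeled "past".
            direction = "past" if j <= i else "future"
            edges.append((i, j, f"{dependency}-{direction}", j - i))
    return edges


if __name__ == "__main__":
    # Three utterances spoken by A, B, A, with a window of one on each side.
    for edge in build_relational_edges(["A", "B", "A"], 1, 1):
        print(edge)
```

In a model, the relation label would typically select relation-specific attention parameters and the offset would index a learned embedding; both of those steps are outside what this sketch covers.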

Summary

Introduction

Interest in emotion recognition in conversations (ERC) has been increasing in various fields (Picard, 2010), because it can be used to analyze user behaviors (Lee and Hong, 2016) and detect fake news (Guo et al., 2019). Recent research on ERC processes the utterances of dialogues in sequence by using recurrent neural network (RNN)-based methods (Hochreiter and Schmidhuber, 1997; Chung et al., 2014; Liu et al., 2016). These methods are not ...

[Table: example conversation with columns Speaker, Utterance, Emotion]

[Table: results by component (Embedding, Contextual Dependency Modeling) with columns Happy, Sad, Neutral, Angry, Excited, Frustrated, Average]

Methods

Results

Conclusion