Abstract

Dialog act recognition (DAR) and sentiment classification (SC) are two correlated tasks for capturing speakers' intentions, where the dialog act and the sentiment indicate the explicit and the implicit intention, respectively. Current state-of-the-art methods usually leverage graph neural networks or attention mechanisms to capture the contextual dependencies of an utterance, thereby aiding efficient identification of speakers' intentions. Although existing attention mechanisms can automatically derive a coefficient for each utterance from the utterance-level embeddings, they often ignore the reliability of those embeddings. This motivates us to propose an ensemble model with two-stage learning for joint DAR and SC. First, we employ a BiLSTM encoder to capture utterance features, which we refer to as the "confidence vector" of the utterance-level embedding. We then introduce an edge-aware graph attention network that improves the classifier's performance by using the confidence vector to selectively leverage contextual information. Experimental results on two benchmark datasets show that our framework achieves state-of-the-art performance against all baselines, demonstrating the effectiveness of our method.
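The core idea of confidence-gated attention can be illustrated with a minimal sketch. This is not the paper's implementation: the function name `confidence_gated_attention` and the specific gating rule (scaling each neighbor's attention score by its scalar confidence before the softmax) are illustrative assumptions about how a per-utterance confidence score could selectively weight contextual information.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def confidence_gated_attention(query, keys, values, confidences):
    """Aggregate neighbor `values` for one utterance (the `query`).

    Each neighbor's dot-product score is scaled by its confidence
    (a hypothetical gating rule), so low-confidence utterance
    embeddings contribute less contextual information.
    """
    scores = [
        c * sum(q * k for q, k in zip(query, key))
        for key, c in zip(keys, confidences)
    ]
    weights = softmax(scores)
    dim = len(values[0])
    return [sum(w * v[d] for w, v in zip(weights, values)) for d in range(dim)]
```

For example, with a query `[1, 0]`, two neighbors with keys/values `[1, 0]` and `[0, 1]`, and uniform confidences, the first (better-matching) neighbor dominates the aggregated context; driving a neighbor's confidence toward zero pushes its weight back toward uniform.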
