Abstract

Dialogue systems, one of the core research fields of natural language processing, attempt to understand a user's utterances and generate an appropriate response. Response selection in a retrieval-based dialogue system involves searching for the most context-appropriate subsequent utterance. Conversations are usually composed of multiple turns; therefore, the speaker's intention must be properly understood prior to response selection. To accurately capture such intended meaning, we propose a retrieval-based response selection model that effectively comprehends the relationships among words and utterances in a conversation and a response candidate through word and utterance attention. Word representations are generated with the self-attention mechanism to reflect the contextual information among intentional words in the overall conversation or an individual utterance, while utterance representations are generated with the cross-attention mechanism to reflect the contextual information among utterances. Furthermore, because our model requires little additional computation and memory, it can be easily combined with other existing response selection models or pre-trained language models. Experiments on various utterance embedding methods were also conducted to find a proper representation of the utterance information. Our proposed model improves hit@1 by approximately 2.1%p on the DSTC8 Ubuntu dataset compared to baseline models, and achieves significant performance improvements on other datasets as well.
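The abstract contrasts two attention operations: self-attention among the words of a conversation (or a single utterance), and cross-attention between utterance-level representations and a response candidate. The following minimal NumPy sketch illustrates both operations in their unparameterized form (no learned query/key/value projections); the function names, shapes, and this simplification are illustrative assumptions, not the paper's exact implementation.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax over the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X):
    # X: (seq_len, d) word embeddings of one utterance or the whole conversation.
    # Each word attends to every word, yielding contextualized word representations.
    d = X.shape[-1]
    scores = X @ X.T / np.sqrt(d)   # word-word affinity matrix
    return softmax(scores) @ X      # (seq_len, d)

def cross_attention(U, R):
    # U: (n_utt, d) utterance vectors of the conversation.
    # R: (n_cand, d) response-candidate vectors.
    # Each candidate attends over the conversation's utterances.
    d = U.shape[-1]
    scores = R @ U.T / np.sqrt(d)   # candidate-utterance affinity matrix
    return softmax(scores) @ U      # (n_cand, d) context-aware candidate vectors
```

In this simplified form, `self_attention` captures word-level context (the abstract's word attention), while `cross_attention` mixes utterance-level context into each response candidate (the abstract's utterance attention); a full model would add learned projections and scoring on top.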
