Abstract

Selecting a proper response is a crucial challenge in retrieval-based chatbots. State-of-the-art methods either match a response with the word sequence of a context, or match the response with each utterance in the context and then accumulate the matching information. The former architecture can lose important local matching information in utterance–response pairs and does not explicitly capture the relationships and dependencies among utterances. The latter architecture ignores important global matching information because the response is never matched with the context at the word level. Neither approach therefore accounts for the fact that matching a response with different levels of a context captures different information for multi-turn response selection. In this work, we propose a hierarchical matching network that matches a response with a context at both the word level and the utterance level. At the word level, we concatenate the multi-turn context into one long word sequence and apply a text matching model to match the response with that sequence, which captures important word-level matching information. At the utterance level, we use the same text matching model to match the response with each utterance in the context, capturing the important matching information in each utterance–response pair, and then accumulate that information with a recurrent neural network to model the relationships among utterances. Finally, the matching information from the two levels is fused to obtain the final matching information. Experiments on two large-scale public multi-turn response selection datasets show that the proposed model significantly outperforms the state-of-the-art baseline models.
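
The two-level flow described in the abstract can be illustrated with a small toy sketch. It is not the paper's model: the abstract does not name the text matching model, the recurrent unit, or the fusion step, so this sketch assumes a bag-of-words cosine similarity as the matcher, a one-unit Elman-style recurrence with fixed, unlearned weights as the accumulator, and a weighted sum as the fusion. All function names and parameters (`match`, `word_level`, `utterance_level`, `hierarchical_score`, `w_in`, `w_rec`, `alpha`) are hypothetical.

```python
import math
from collections import Counter


def tokenize(text):
    # Whitespace tokenizer; an assumption made only for this toy example.
    return text.lower().split()


def match(words_a, words_b):
    # Stand-in for the shared text matching model (assumption):
    # cosine similarity between bag-of-words count vectors.
    a, b = Counter(words_a), Counter(words_b)
    dot = sum(a[w] * b[w] for w in a)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def word_level(context, response):
    # Concatenate every utterance into one long word sequence, match once.
    long_sequence = [w for utterance in context for w in utterance]
    return match(long_sequence, response)


def utterance_level(context, response, w_in=1.0, w_rec=0.5):
    # Match the response with each utterance using the same matcher, then
    # accumulate the per-utterance scores in turn order with a one-unit
    # recurrence. The weights are fixed here, not learned (assumption).
    h = 0.0
    for utterance in context:
        h = math.tanh(w_in * match(utterance, response) + w_rec * h)
    return h


def hierarchical_score(context, response, alpha=0.5):
    # Fuse the two levels; a weighted sum is an assumption of this sketch.
    return (alpha * word_level(context, response)
            + (1.0 - alpha) * utterance_level(context, response))


context = [tokenize(u) for u in (
    "how do i install python on ubuntu",
    "which version do you need",
    "python 3 please",
)]
candidates = [
    "run sudo apt install python3 on ubuntu",
    "the weather is nice today",
]
# Response selection: keep the candidate with the highest fused score.
best = max(candidates, key=lambda r: hierarchical_score(context, tokenize(r)))
print(best)
```

In a trained model each of the three stand-ins would be a learned neural component producing matching vectors rather than scalar scores. The sketch only shows how the same matcher is reused at both levels and where the accumulation and fusion steps sit.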
