Abstract

Coarse-grained response selection is a fundamental subsystem of widely used retrieval-based chatbots: it recalls a coarse-grained candidate set from a large-scale dataset. Dense retrieval has recently proven very effective for building such a subsystem. However, dialogue dense retrieval models face two problems in real-world scenarios: (1) the multi-turn dialogue history is re-encoded at every turn, making inference inefficient; (2) the offline index is enormous, which significantly increases deployment cost. To address these problems, we propose an efficient coarse-grained response selection subsystem consisting of two novel methods. To address the first problem, we propose Hierarchical Dense Retrieval, which caches rich multi-vector representations of the dialogue history and encodes only the user's latest utterance, yielding better inference efficiency. To address the second problem, we design Deep Semantic Hashing, which reduces the index storage while largely preserving recall accuracy. Extensive experimental results demonstrate the advantages of the two proposed methods over previous work. Specifically, with limited performance loss, our proposed coarse-grained response selection model achieves over 5x FLOPs speedup and an over 192x storage compression ratio. Moreover, our source code has been publicly released.
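The history-caching idea behind Hierarchical Dense Retrieval can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: the encoder is a hypothetical stand-in (a deterministic random projection in place of a trained neural tower), and the max-over-history scoring is just one simple aggregation choice. The point it shows is that each turn encodes only the newest utterance, while vectors for earlier turns are served from a cache.

```python
import hashlib
import numpy as np

DIM = 768  # embedding width, chosen arbitrarily for the sketch

def encode(utterance: str) -> np.ndarray:
    """Hypothetical stand-in for the real utterance encoder (e.g. a BERT tower)."""
    seed = int.from_bytes(hashlib.md5(utterance.encode()).digest()[:4], "big")
    return np.random.default_rng(seed).standard_normal(DIM).astype(np.float32)

class HierarchicalRetriever:
    """Caches one vector per past utterance; each new turn encodes only the
    latest utterance instead of re-encoding the whole dialogue history."""

    def __init__(self, response_vecs: np.ndarray):
        self.response_vecs = response_vecs  # candidate responses, encoded offline
        self.history_vecs = []              # cached multi-vector dialogue history

    def turn(self, utterance: str, top_k: int = 5) -> np.ndarray:
        self.history_vecs.append(encode(utterance))  # one encoding per turn
        history = np.stack(self.history_vecs)        # (num_turns, DIM)
        # Simple aggregation choice: score each candidate by its best match
        # against any cached history vector (max-over-history).
        scores = (self.response_vecs @ history.T).max(axis=1)
        return np.argsort(-scores)[:top_k]

rng = np.random.default_rng(0)
retriever = HierarchicalRetriever(rng.standard_normal((1000, DIM)).astype(np.float32))
first = retriever.turn("hi, can you recommend a movie?")
second = retriever.turn("something funny, please")
print(len(retriever.history_vecs))  # 2 cached vectors after two turns
```

Re-encoding cost per turn is thus constant in the history length, whereas a flat dense retriever would re-encode the concatenated history every turn.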
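For intuition on how semantic hashing shrinks the index, here is a toy sketch under stated assumptions: a random projection stands in for the paper's learned deep hashing layer, and the dimensions are illustrative. Note that replacing a 768-dim float32 embedding (3,072 bytes) with a 128-bit code (16 bytes) gives exactly the 192x compression ratio mentioned above; retrieval then uses Hamming distance over packed bits.

```python
import numpy as np

rng = np.random.default_rng(0)
DIM, BITS = 768, 128  # 3,072 bytes -> 16 bytes per vector: a 192x compression

# Hypothetical stand-in for a learned hashing layer: a fixed random projection.
proj = rng.standard_normal((DIM, BITS)).astype(np.float32)

def to_code(vecs: np.ndarray) -> np.ndarray:
    """Binarize dense embeddings into packed bit codes (sign of each projection)."""
    bits = (vecs @ proj) > 0
    return np.packbits(bits, axis=-1)  # shape (n, BITS // 8), dtype uint8

def hamming(query_code: np.ndarray, index_codes: np.ndarray) -> np.ndarray:
    """Hamming distances via XOR + popcount over the packed uint8 codes."""
    xor = np.bitwise_xor(index_codes, query_code)
    return np.unpackbits(xor, axis=-1).sum(axis=-1)

index_vecs = rng.standard_normal((1000, DIM)).astype(np.float32)
index_codes = to_code(index_vecs)  # 16 KB instead of 3 MB for the whole index

# A slightly perturbed copy of vector 42 should still hash to a nearby code.
query = index_vecs[42] + 0.01 * rng.standard_normal(DIM).astype(np.float32)
dists = hamming(to_code(query[None, :])[0], index_codes)
print(int(np.argmin(dists)))  # 42: the near-duplicate ranks first
```

The learned hashing in the paper is trained to preserve recall accuracy under this binarization, which a random projection does not attempt; the sketch only shows the storage and search mechanics.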
