Abstract
Dialogue state tracking (DST) plays a crucial role in task-oriented dialogue systems, as it interprets and tracks user intentions throughout the dialogue. The accuracy of DST directly impacts system efficiency and user experience. While existing generation-based DST models have shown promising results, effectively modelling long dialogue sequences remains a challenge. Furthermore, the greedy search decoding strategy adopted for speed by most previous works is suboptimal, because the ground-truth token does not always have the highest probability. In this study, we propose a novel learning framework that addresses these issues by incorporating turn-level contrastive learning and a reranking module. Specifically, we adopt turn-level contrastive learning to progressively rectify the intermediate states of lengthy dialogues by contrasting data points at a finer granularity. Additionally, the reranking module enables the model to make more reliable decisions at each time step by also considering alternative high-probability tokens. Experimental results demonstrate that our approach consistently enhances the model's ability to handle long dialogue sequences and outperforms strong baselines on both the MultiWOZ 2.1 and MultiWOZ 2.2 datasets.
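To make the turn-level contrastive idea concrete, here is a minimal InfoNCE-style sketch in PyTorch. The pairing scheme (two views of each turn's representation, with the other turns in the batch serving as negatives), the function name, and the temperature value are illustrative assumptions, not the paper's exact objective.

```python
import torch
import torch.nn.functional as F

def turn_level_contrastive_loss(anchor, positive, temperature=0.1):
    """InfoNCE-style loss over per-turn representations.

    anchor, positive: (num_turns, hidden) tensors, where row i holds two
    views of turn i's encoder representation; all other turns in the
    batch act as in-batch negatives. Illustrative sketch only.
    """
    anchor = F.normalize(anchor, dim=-1)
    positive = F.normalize(positive, dim=-1)
    # (T, T) similarity matrix; the diagonal holds the positive pairs.
    logits = anchor @ positive.t() / temperature
    targets = torch.arange(anchor.size(0), device=anchor.device)
    return F.cross_entropy(logits, targets)
```

Likewise, one plausible reading of the reranking module is a decoding step that restricts attention to the generator's top-k tokens and blends their log-probabilities with an auxiliary reranker score; `rerank_scores`, `alpha`, and the linear blending rule below are hypothetical placeholders rather than the paper's method.

```python
import torch

def rerank_topk(step_logits, rerank_scores, k=5, alpha=0.5):
    """Choose the next token from the generator's top-k candidates,
    mixing generator log-probs with reranker scores.

    step_logits: (vocab,) logits for the current decoding step.
    rerank_scores: callable mapping (k,) candidate token ids to (k,)
    auxiliary scores. All names and the mixing rule are assumptions.
    """
    log_probs = torch.log_softmax(step_logits, dim=-1)
    topk_lp, topk_ids = log_probs.topk(k)
    combined = alpha * topk_lp + (1 - alpha) * rerank_scores(topk_ids)
    return topk_ids[combined.argmax()]

# Usage with a placeholder reranker that scores all candidates equally:
next_id = rerank_topk(torch.randn(32000), lambda ids: torch.zeros(ids.shape))
```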