Abstract

Dialogue state tracking is a core component of task-oriented dialogue systems, responsible for tracking the user's goal throughout the conversation. Because of the diversity of natural language utterances, user requests in these systems may contain unknown values at different turns, yet predicting the actual values of those requests is necessary to complete the intended task. Existing studies determine these values with span-based methods that predict a span in the current utterance or the previous dialogue. However, slots are often filled incorrectly when values consist of multiple words. Moreover, in some scenarios the slot values at a given turn depend on previous dialogue states, but the input-length limit of language models makes it impossible to access all of them. This study proposes a new approach that uses a span-tokenizer and adds a Bi-LSTM layer on top of the BERT model to predict the exact span of multi-word values. The approach takes the user utterance, the important dialogue history, and all dialogue states as input in order to reduce sequence length. On the MultiWOZ 2.1 dataset, this strategy improves joint-goal accuracy by 1.80% and slot accuracy by 0.15% compared to the SAVN model.
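To illustrate the span-based extraction the abstract refers to, the sketch below shows how a multi-word slot value can be decoded from per-token start and end scores. This is a minimal, generic illustration, not the paper's model: the token list and scores are invented, and `best_span` stands in for whatever decoding the BERT + Bi-LSTM head would use.

```python
def best_span(start_scores, end_scores, max_len=10):
    """Pick the (start, end) pair maximizing start + end score,
    subject to start <= end and a maximum span length.
    This allows multi-word values, unlike taking two independent argmaxes."""
    best, best_score = (0, 0), float("-inf")
    for i, s in enumerate(start_scores):
        for j in range(i, min(i + max_len, len(end_scores))):
            score = s + end_scores[j]
            if score > best_score:
                best, best_score = (i, j), score
    return best

# Hypothetical tokens and scores for the slot "hotel-name".
tokens = ["the", "grand", "hotel", "in", "cambridge"]
start = [0.1, 2.0, 0.3, 0.0, 0.2]
end = [0.0, 0.5, 1.8, 0.1, 0.4]

i, j = best_span(start, end)
value = " ".join(tokens[i : j + 1])  # -> "grand hotel", a multi-word value
```

A model that fills slots token-by-token would have to pick "grand" or "hotel" alone; scoring whole spans is what lets the tracker recover the full value.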
