Abstract

Natural language understanding is a critical module in task-oriented dialogue systems. Recent state-of-the-art approaches use deep learning methods and transformers to improve the performance of dialogue systems. In this work, we propose a natural language understanding model with a shopping-specific named entity recognizer, built on a jointly trained BERT transformer, for task-oriented dialogue systems in the Persian language. Since no dataset for Persian online shopping dialogue systems has been published, we propose two methods for generating training data: a fully simulated method and a semi-simulated method. We created a simulated dataset with a hybrid of rule-based and template-based generation methods, and a semi-simulated dataset in which the language generation part is done by a human to increase the quality of the dataset. Our experiments with the natural language understanding module show that combining the two datasets can improve results. These dataset generation methods can also be applied in other domains for low-resource languages in task-oriented dialogue systems to address the cold-start problem of missing datasets.
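
As a rough illustration of what template-based generation of labeled training data can look like, the following minimal sketch fills slot placeholders in utterance templates and emits an intent label plus BIO slot tags per token, the kind of supervision a joint intent and slot model is typically trained on. It is not the generation pipeline used in this work: the templates, intent names, slot names, and slot values are all hypothetical, and English stands in for Persian for readability.

```python
import random

# Hypothetical templates: an intent label paired with a pattern whose
# {placeholders} name the slots to fill. Not taken from the paper.
TEMPLATES = [
    ("search_product", "I am looking for a {color} {product}"),
    ("ask_price", "how much is the {brand} {product}"),
]

# Hypothetical slot values; multi-word values exercise the I- tags.
SLOT_VALUES = {
    "color": ["red", "black"],
    "product": ["phone", "running shoes"],
    "brand": ["Samsung", "Nike"],
}


def generate_example(rng):
    """Fill one randomly chosen template and label each token with a BIO tag."""
    intent, template = rng.choice(TEMPLATES)
    tokens, tags = [], []
    for word in template.split():
        if word.startswith("{") and word.endswith("}"):
            slot = word[1:-1]
            value_tokens = rng.choice(SLOT_VALUES[slot]).split()
            tokens.extend(value_tokens)
            # First token of the value opens the entity, the rest continue it.
            tags.append("B-" + slot)
            tags.extend(["I-" + slot] * (len(value_tokens) - 1))
        else:
            tokens.append(word)
            tags.append("O")
    return {"intent": intent, "tokens": tokens, "tags": tags}


def generate_dataset(n, seed=0):
    """Generate n labeled examples; a fixed seed makes the output reproducible."""
    rng = random.Random(seed)
    return [generate_example(rng) for _ in range(n)]
```

Calling `generate_dataset(1000)` under these assumptions yields a list of dictionaries with aligned `tokens` and `tags` lists. In a semi-simulated setting, the intent and slot values could be kept while a human rewrites the surface sentence, at the cost of having to re-align the slot tags afterwards.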
