Abstract
Detecting intents and extracting the necessary contextual information (i.e., named entities) from input utterances are two fundamental tasks in understanding what users say in chatbot systems. While most work in this field has been dedicated to high-resource languages in popular domains such as business and home automation, little research has addressed low-resource languages, especially in a less popular domain such as education. To narrow this gap, this paper presents the first study on learning to detect student intents and to recognize named entities in the education domain for the Vietnamese language. Specifically, we first introduce a complete corpus consisting of 3,690 student utterances, manually annotated with both named entities and intent information at two levels of granularity: fine-grained and coarse-grained. We then systematically investigate different approaches to the two tasks using both independent and joint learning architectures. The experimental results show that the joint architectures based on pre-trained language models perform better on both tasks, outperforming the conventional independent learning architectures, which treat the two tasks separately. Moreover, to further enhance the final performance, this paper proposes a technique to enrich the models with more useful linguistic features. Compared to the standard approaches, we achieve considerably better results on both tasks in both architectures. Overall, for the named entity recognition task, the best model yielded an F1 score of 88.61%. For the intent detection task, it yielded F1 scores of 94.36% and 91.62% at the coarse-grained and fine-grained levels, respectively.
More From: International Journal on Artificial Intelligence Tools