Abstract

The ‘intention’ classification of a user question is a key component of a task-engine-driven chatbot, and understanding a user question’s intention is essentially a text classification problem. Transfer learning models such as BERT (Bidirectional Encoder Representations from Transformers) and ERNIE (Enhanced Representation through Knowledge Integration) have raised text classification to a new level, but BERT and ERNIE are difficult to deploy in high-QPS (queries per second) intelligent dialogue systems because of their computational cost. In practice, simple classification models usually offer high computational performance but are limited by low accuracy. In this paper, we use knowledge distillation to transfer knowledge from the ERNIE model to the FastText model: the ERNIE model acts as a teacher, predicting labels on massive unlabeled online data for data enhancement, and then guides the training of the more computationally efficient FastText student model. The FastText model distilled from ERNIE for chatbot intention classification not only retains its original computational performance but also significantly improves intention classification accuracy.

Highlights

  • Intention classification is essential to the performance of a task-oriented chatbot

  • Chatbots have become more widely used in industries and smart cities with the rapid development of Artificial Intelligence

  • This paper applies a teacher-student knowledge distillation framework and proposes an ERNIE-based teacher model


Summary

Introduction

Intention classification is essential to the performance of a task-oriented chatbot (Sensors 2022, 22, 1270). With the rapid development of deep learning technology, neural networks have been applied to text classification. Handling a large number of real customers online is usually computationally expensive for a chatbot; a simple intention classification model offers good computational performance but is limited by low classification accuracy. Through knowledge distillation, the teacher model ERNIE predicts intention labels on a large amount of online data, and the student model FastText is then distilled on this enhanced large-scale corpus from the teacher model. Several other text classification models, such as LSTM and BERT, are applied to the same task for comparison.
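The distillation pipeline described above can be sketched in a few steps: the teacher assigns pseudo-labels to unlabeled online queries, and the pseudo-labeled corpus is written in the format that FastText's supervised trainer expects. This is a minimal sketch, not the paper's implementation; `teacher_predict` is a hypothetical stand-in for a fine-tuned ERNIE classifier, and the example labels and queries are illustrative.

```python
def teacher_predict(question: str) -> str:
    """Hypothetical teacher model. In the paper's setting this would be
    batched inference with a fine-tuned ERNIE intention classifier;
    here a keyword rule stands in so the sketch is self-contained."""
    return "book_flight" if "flight" in question else "other"

def build_distillation_corpus(unlabeled_questions):
    """Pseudo-label unlabeled online queries and format each example as
    one line of FastText supervised input: '__label__<label> <text>'."""
    lines = []
    for q in unlabeled_questions:
        label = teacher_predict(q)
        lines.append(f"__label__{label} {q}")
    return lines

corpus = build_distillation_corpus([
    "I want to book a flight to Beijing",
    "What is the weather today",
])
# Writing these lines to a file would let the student model be trained
# with the `fasttext` package, e.g. fasttext.train_supervised(input="distill.txt").
```

The key design point is that the student never needs the original gold labels: the teacher's predictions on large-scale unlabeled traffic serve as the (noisy) supervision signal, which is what makes the enhanced corpus much larger than the hand-labeled one.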

Pre-Trained Model
Experiment and Result Analysis
Findings
Conclusions
