Abstract

Traditional end-to-end task-oriented dialogue models are based on complex recurrent or convolutional neural networks composed of an encoder and a decoder. However, when the data are large and contain many types of questions, these models perform poorly: their memory cannot effectively record all of the sentence information in the context. In view of this, this article uses a modified Transformer model to overcome these problems in dialogue tasks. The Transformer is built entirely from attention mechanisms and completely discards recurrent neural networks (RNNs); its structure consists of two sub-parts, an encoder and a decoder. It uses residual connections, batch normalization, and self-attention to build the model structure, and uses positional encoding to capture word-order information, which speeds up training convergence and captures longer-range sentence information. In this paper, we modify the activation function in the Transformer and use label smoothing during training so that the model's expressive ability is better than before.
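As a rough illustration of two of the components mentioned above, the sketch below shows sinusoidal positional encoding and a label-smoothed cross-entropy loss. It is not the paper's implementation; all function names and hyperparameters (e.g., a smoothing factor of 0.1) are illustrative assumptions.

```python
# Minimal sketch (illustrative, not the authors' code) of two components
# referenced in the abstract: sinusoidal positional encoding and a
# label-smoothed cross-entropy loss.
import numpy as np

def positional_encoding(max_len: int, d_model: int) -> np.ndarray:
    """Sinusoidal positional encoding as in Vaswani et al. (2017)."""
    positions = np.arange(max_len)[:, None]          # (max_len, 1)
    dims = np.arange(d_model)[None, :]               # (1, d_model)
    # Each pair of dimensions (2i, 2i+1) shares the frequency 1/10000^(2i/d_model).
    angle_rates = 1.0 / np.power(10000.0, (2 * (dims // 2)) / d_model)
    angles = positions * angle_rates                 # (max_len, d_model)
    pe = np.zeros((max_len, d_model))
    pe[:, 0::2] = np.sin(angles[:, 0::2])            # even dimensions: sine
    pe[:, 1::2] = np.cos(angles[:, 1::2])            # odd dimensions: cosine
    return pe

def label_smoothed_loss(logits: np.ndarray, target: int, eps: float = 0.1) -> float:
    """Cross-entropy against a smoothed target: (1 - eps) on the gold token,
    eps spread uniformly over the vocabulary (one common formulation)."""
    vocab = logits.shape[-1]
    shifted = logits - np.max(logits)                # numerical stability
    log_probs = shifted - np.log(np.sum(np.exp(shifted)))  # log-softmax
    smooth = np.full(vocab, eps / vocab)
    smooth[target] += 1.0 - eps
    return float(-np.sum(smooth * log_probs))

# Example: positions for a 50-token sentence in a 512-dimensional model.
pe = positional_encoding(50, 512)
print(pe.shape)  # (50, 512)
```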
