Abstract

Answer generation is one of the most important tasks in natural language processing, and deep learning-based methods have shown their strength over traditional machine learning methods. However, most previous deep learning-based answer generation models were built on recurrent neural networks (RNNs) or convolutional neural networks (CNNs). The former cannot fully exploit the contextual correlations preserved in paragraphs because of their inherently sequential computation; the latter, because the convolutional kernel size is fixed, cannot extract complete semantic features. To alleviate these problems, we propose an end-to-end answer generation model (AG-MTA) built on a multi-layer Transformer aggregation encoder. AG-MTA consists of a multi-layer attention Transformer unit and a multi-layer attention Transformer aggregation encoder (MTA). It attends to information at different positions and aggregates nodes at the same layer to combine contextual information, fusing semantic information from the bottom layer to the top layer and thereby enriching the encoder's representation. Furthermore, a novel position encoding method based on trigonometric functions is proposed. Experiments are conducted on the public SQuAD dataset, where AG-MTA reaches state-of-the-art performance with an EM score of 71.1 and an F1 score of 80.3.
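This summary does not give the formula for the proposed trigonometric position encoding. As a point of reference, the sketch below implements the standard sinusoidal position encoding from the original Transformer, which encodings of this kind typically extend; the function name and the even `d_model` assumption are mine, not the paper's.

```python
import math
import torch

def sinusoidal_position_encoding(seq_len: int, d_model: int) -> torch.Tensor:
    """Standard trigonometric position encoding (Vaswani et al., 2017):
    PE[pos, 2i]   = sin(pos / 10000^(2i/d_model))
    PE[pos, 2i+1] = cos(pos / 10000^(2i/d_model))
    Assumes d_model is even.
    """
    position = torch.arange(seq_len, dtype=torch.float32).unsqueeze(1)   # (seq_len, 1)
    div_term = torch.exp(
        torch.arange(0, d_model, 2, dtype=torch.float32)
        * (-math.log(10000.0) / d_model)
    )                                                                     # (d_model/2,)
    pe = torch.zeros(seq_len, d_model)
    pe[:, 0::2] = torch.sin(position * div_term)  # even dimensions get sine
    pe[:, 1::2] = torch.cos(position * div_term)  # odd dimensions get cosine
    return pe

# The encoding is added to the token embeddings before the first encoder layer:
# x = embeddings + sinusoidal_position_encoding(embeddings.size(1), embeddings.size(2))
```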

Highlights

  • Question answering (Q&A) systems are built on understanding the questions

  • To enhance the relevance of contextual information, we propose a novel multi-layer attention Transformer aggregation encoder (MTA) and a novel answer generation network based on it (AG-MTA)

  • We propose an end-to-end answer generation model built on the multi-layer Transformer aggregation encoder

Summary

INTRODUCTION

Question answering (Q&A) systems are built on understanding the questions. A recent method [4] combines a CNN with an attention mechanism for Chinese question classification, which boosts the quality of answer generation. Most current research relies on typical neural networks for tasks such as intent classification and answer generation. To enhance the relevance of contextual information, we propose a novel multi-layer attention Transformer aggregation encoder (MTA), which exploits contextual information at different layers to model sequences, and an answer generation network built on it (AG-MTA).
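The summary does not spell out the aggregation function, so the following is only a minimal sketch of the layer-aggregation idea, assuming an ELMo-style learned weighted sum over the outputs of all encoder layers; the class name, hyperparameters, and weighting scheme are illustrative assumptions, not the paper's exact design.

```python
import torch
import torch.nn as nn

class MultiLayerAggregationEncoder(nn.Module):
    """Illustrative encoder: run a stack of Transformer layers, keep every
    layer's output, and fuse them with learned softmax weights so the final
    representation mixes low-level and high-level contextual features."""

    def __init__(self, d_model: int = 512, nhead: int = 8, num_layers: int = 6):
        super().__init__()
        self.layers = nn.ModuleList(
            [nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
             for _ in range(num_layers)]
        )
        self.layer_weights = nn.Parameter(torch.zeros(num_layers))  # one scalar per layer

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        outputs = []
        for layer in self.layers:           # collect every intermediate representation
            x = layer(x)
            outputs.append(x)
        w = torch.softmax(self.layer_weights, dim=0)       # (num_layers,)
        stacked = torch.stack(outputs)                     # (L, batch, seq, d_model)
        return (w.view(-1, 1, 1, 1) * stacked).sum(dim=0)  # weighted fusion across layers

# Example usage:
# out = MultiLayerAggregationEncoder()(torch.randn(2, 30, 512))  # (2, 30, 512)
```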

RELATED WORK
MULTI-LAYER ATTENTION TRANSFORMER UNIT
MULTI-LAYER ATTENTION TRANSFORMER AGGREGATION ENCODER
Findings
CONCLUSION