Abstract

Sentence matching, which aims to capture the semantic relationship between two sequences, is a crucial problem in NLP research. It plays a vital role in natural language tasks such as question answering, natural language inference, and paraphrase identification. State-of-the-art approaches exploit the interactive information between sentence pairs by adopting the general Compare-Aggregate framework, achieving promising results. In this study, we propose a Densely connected Transformer that performs multiple matching processes with co-attentive information to enhance the interaction between sentence pairs at each matching step. Specifically, our model consists of multiple stacked matching blocks. Inside each block, we first employ a transformer encoder to obtain refined representations for the two sequences; we then apply a multi-way or multi-head co-attention mechanism to perform word-level comparison between them, and fuse the original and aligned representations to form the alignment information of that matching layer. We evaluate our proposed model on five well-studied sentence matching datasets and achieve highly competitive performance.
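To make the block structure concrete, below is a minimal PyTorch sketch of one matching block, assuming dot-product co-attention and concatenation-based fusion. The class name MatchingBlock, the difference/product fusion features, and all layer sizes are illustrative assumptions, not the paper's exact design.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MatchingBlock(nn.Module):
    """One matching block: transformer encoding, co-attention, fusion.

    A minimal sketch of the block described in the abstract; fusion
    details and hyperparameters are assumptions for illustration.
    """

    def __init__(self, dim, heads=4):
        super().__init__()
        # A shared transformer encoder layer refines each sequence independently.
        self.encoder = nn.TransformerEncoderLayer(d_model=dim, nhead=heads,
                                                  batch_first=True)
        # Fuse original and aligned representations back to `dim` channels.
        self.fuse = nn.Linear(4 * dim, dim)

    def co_attend(self, a, b):
        # Word-level soft alignment: each token in `a` attends over all of `b`,
        # and vice versa, using a shared dot-product score matrix.
        scores = torch.bmm(a, b.transpose(1, 2))          # (batch, len_a, len_b)
        a_aligned = torch.bmm(F.softmax(scores, dim=-1), b)
        b_aligned = torch.bmm(F.softmax(scores.transpose(1, 2), dim=-1), a)
        return a_aligned, b_aligned

    def forward(self, a, b):
        a, b = self.encoder(a), self.encoder(b)
        a_al, b_al = self.co_attend(a, b)
        # Fusion combines each token with its aligned counterpart; the
        # difference and elementwise-product features are a common choice
        # in Compare-Aggregate models, assumed here.
        a = self.fuse(torch.cat([a, a_al, a - a_al, a * a_al], dim=-1))
        b = self.fuse(torch.cat([b, b_al, b - b_al, b * b_al], dim=-1))
        return a, b
```

Stacking several such blocks, with later blocks reusing the alignment information produced by earlier ones, would give the dense connectivity the model's name suggests.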
