Abstract

How to model a pair of sentences is a critical issue in many NLP tasks such as answer selection (AS), paraphrase identification (PI) and textual entailment (TE). Most prior work (i) deals with one individual task by fine-tuning a specific system; (ii) models each sentence’s representation separately, rarely considering the impact of the other sentence; or (iii) relies fully on manually designed, task-specific linguistic features. This work presents a general Attention Based Convolutional Neural Network (ABCNN) for modeling a pair of sentences. We make three contributions. (i) The ABCNN can be applied to a wide variety of tasks that require modeling of sentence pairs. (ii) We propose three attention schemes that integrate mutual influence between sentences into CNNs; thus, the representation of each sentence takes into consideration its counterpart. These interdependent sentence pair representations are more powerful than isolated sentence representations. (iii) ABCNNs achieve state-of-the-art performance on AS, PI and TE tasks. We release code at: https://github.com/yinwenpeng/Answer_Selection.

Highlights

  • How to model a pair of sentences is a critical issue in many NLP tasks such as answer selection (AS) (Yu et al., 2014; Feng et al., 2015), paraphrase identification (PI) (Madnani et al., 2012; Yin and Schütze, 2015a) and textual entailment (TE) (Marelli et al., 2014a; Bowman et al., 2015a), among others.

  • We introduce our basic Convolutional Neural Network (CNN), which is based on the Siamese architecture (Bromley et al., 1993), i.e., it consists of two weight-sharing CNNs, each processing one of the two sentences, and a final layer that solves the sentence pair task.

  • Comparing the Attention Based Convolutional Neural Network (ABCNN)-2 with the ABCNN-1, we find that the ABCNN-2 performs slightly better even though it is the simpler architecture.
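The Siamese baseline described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the paper's implementation: the filter bank, embedding dimensions and the cosine-similarity output layer are all hypothetical choices made for the sketch; the key point it demonstrates is that one shared set of convolution weights encodes both sentences, and max-pooling yields fixed-size representations that a final layer can compare.

```python
import numpy as np

rng = np.random.default_rng(0)

def conv1d(sent, filt):
    """Slide one convolution filter of width w over the word-embedding
    matrix `sent` (length x dim) and apply a ReLU."""
    w = filt.shape[0]
    n = sent.shape[0] - w + 1
    out = np.array([np.sum(sent[i:i + w] * filt) for i in range(n)])
    return np.maximum(out, 0.0)

def encode(sent, filters):
    """CNN encoder: convolution + max-pooling per filter gives a
    fixed-size representation regardless of sentence length."""
    return np.array([conv1d(sent, f).max() for f in filters])

# Hypothetical inputs: two sentences as random word embeddings.
s0 = rng.normal(size=(7, 4))          # 7 words, 4-dim embeddings
s1 = rng.normal(size=(5, 4))          # 5 words, 4-dim embeddings
filters = rng.normal(size=(8, 3, 4))  # 8 shared filters of width 3

# Siamese property: the SAME filters encode both sentences.
r0, r1 = encode(s0, filters), encode(s1, filters)

# Final layer (a stand-in): cosine similarity of the representations.
score = r0 @ r1 / (np.linalg.norm(r0) * np.linalg.norm(r1) + 1e-8)
```

Because the weights are shared, the two encoders map semantically similar inputs to nearby representations, which is what makes the final comparison layer meaningful.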


Summary

Introduction

How to model a pair of sentences is a critical issue in many NLP tasks such as answer selection (AS) (Yu et al., 2014; Feng et al., 2015), paraphrase identification (PI) (Madnani et al., 2012; Yin and Schütze, 2015a) and textual entailment (TE) (Marelli et al., 2014a; Bowman et al., 2015a), among others. Most prior work derives each sentence’s representation separately, rarely considering the impact of the other sentence. This neglects the mutual influence of the two sentences in the context of the task, and it contradicts what humans do when comparing two sentences: human beings model the two sentences together, using the content of one sentence to guide the representation of the other.
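The mutual influence described above is what the attention schemes make explicit: an attention matrix scores every pair of units across the two sentences, and its row and column sums tell each CNN which parts of its input the other sentence cares about. The sketch below is a hedged illustration with random inputs; the match-score form 1/(1 + |x − y|), with |·| the Euclidean distance, follows the one proposed in the ABCNN paper, while the feature dimensions are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(1)

def attention_matrix(f0, f1):
    """A[i, j] scores how well unit i of sentence 0 matches unit j of
    sentence 1, using match-score(x, y) = 1 / (1 + |x - y|)."""
    diff = f0[:, None, :] - f1[None, :, :]       # pairwise differences
    return 1.0 / (1.0 + np.linalg.norm(diff, axis=2))

# Hypothetical feature maps for two sentences (units x feature dim).
f0 = rng.normal(size=(7, 4))
f1 = rng.normal(size=(5, 4))

A = attention_matrix(f0, f1)

# Row/column sums measure how strongly each unit of one sentence is
# attended to by the other; this signal can then condition convolution
# (as in ABCNN-1) or pooling (as in ABCNN-2).
attn0 = A.sum(axis=1)  # importance of each unit of sentence 0
attn1 = A.sum(axis=0)  # importance of each unit of sentence 1
```

Each sentence's representation is thus computed with its counterpart in view, rather than in isolation, mirroring the human comparison process the paragraph describes.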


