Abstract
In this paper, we present a fast and strong neural approach for general-purpose text matching applications. We explore what is sufficient to build a fast and well-performing text matching model and propose to keep three key features available for inter-sequence alignment: original point-wise features, previously aligned features, and contextual features, while simplifying all the remaining components. We conduct experiments on four well-studied benchmark datasets across the tasks of natural language inference, paraphrase identification, and answer selection. The performance of our model is on par with the state-of-the-art on all datasets with far fewer parameters, and its inference speed is at least six times faster than that of similarly performing models.
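As a rough illustration of this idea (a minimal sketch with our own variable names and shapes, not the authors' code), the three feature views could be concatenated per token to form the input of each alignment layer:

```python
import torch

def alignment_input(pointwise, prev_aligned, contextual):
    """Concatenate the three feature views kept for inter-sequence alignment.

    pointwise:    original point-wise features (e.g. word embeddings), (batch, seq_len, d1)
    prev_aligned: aligned features from the previous block,            (batch, seq_len, d2)
    contextual:   contextual features (e.g. an encoder's output),      (batch, seq_len, d3)
    """
    # Each token is represented by all three views at once, so a single
    # alignment layer sees the raw, the previously aligned, and the
    # contextual signal together.
    return torch.cat([pointwise, prev_aligned, contextual], dim=-1)
```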
Highlights
Text matching is a core research area in natural language processing with a long history
Our proposed method achieves performance on par with the state-of-the-art on four benchmark datasets across three different tasks, namely SNLI and SciTail for natural language inference, Quora Question Pairs for paraphrase identification, and WikiQA for answer selection
Since paraphrase identification is a symmetric task where the two input sequences can be swapped with no effect on the label of the text pair, in hyperparameter tuning we validate between the two symmetric orderings of each input pair
Summary
Text matching is a core research area in natural language processing with a long history. A model takes two text sequences as input and predicts a category or a scalar value indicating their relationship. A wide range of tasks, including natural language inference (also known as recognizing textual entailment) (Bowman et al., 2015; Khot et al., 2018), paraphrase identification (Wang et al., 2017), answer selection (Yang et al., 2015), and so on, can be seen as specific forms of the text matching problem. Deep neural networks are the most popular choice for text matching nowadays. Semantic alignment and comparison of two text sequences are the key to neural text matching. Many previous deep neural networks contain a single inter-sequence alignment layer. To make full use of this single alignment process, the model has to take rich external syntactic features or hand-designed alignment features as additional inputs to the alignment layer (Chen et al., 2017; Gong et al., 2018), adopt a complicated alignment mechanism (Wang et al., 2017; Tan et al., 2018), or build a vast number of post-processing layers to analyze the alignment result (Tay et al., 2018b; Gong et al., 2018).
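To make the alignment step concrete, here is a minimal sketch of an inter-sequence alignment layer using plain dot-product attention (our own simplified illustration; the models cited above use more elaborate variants):

```python
import torch
import torch.nn.functional as F

def align(a, b):
    """Align two encoded sequences against each other.

    a: (batch, len_a, d)  -- encoded tokens of the first sequence
    b: (batch, len_b, d)  -- encoded tokens of the second sequence
    """
    # Pairwise similarity between every token of a and every token of b.
    scores = torch.bmm(a, b.transpose(1, 2))              # (batch, len_a, len_b)
    # Each token of a is summarized by a weighted sum over b, and vice versa.
    aligned_a = torch.bmm(F.softmax(scores, dim=2), b)    # (batch, len_a, d)
    aligned_b = torch.bmm(F.softmax(scores, dim=1).transpose(1, 2), a)  # (batch, len_b, d)
    return aligned_a, aligned_b
```

The aligned representation of each token is typically compared with its original representation (for example via concatenation, difference, and element-wise product) before pooling and prediction.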