A deep network model for paraphrase detection in short text messages

Basant Agarwal, Heri Ramampiaro, Helge Langseth, Massimiliano Ruocco

doi:10.1016/j.ipm.2018.06.005

Abstract

This paper is concerned with paraphrase detection, i.e., identifying sentences that are semantically identical. The ability to detect similar sentences written in natural language is crucial for several applications, such as text mining, text summarization, plagiarism detection, authorship authentication and question answering. Recognizing this importance, we study in particular how to address the challenges of detecting paraphrases in user-generated short texts, such as Twitter messages, which often contain language irregularities and noise, and do not necessarily contain as much semantic information as longer, clean texts. We propose a novel deep neural network-based approach that relies on coarse-grained sentence modelling using a convolutional neural network (CNN) and a recurrent neural network (RNN) model, combined with a specific fine-grained word-level similarity matching model. More specifically, we develop a new architecture, called DeepParaphrase, which creates an informative semantic representation of each sentence by (1) using a CNN to extract local region information in the form of important n-grams from the sentence, and (2) applying an RNN to capture long-term dependency information. In addition, we perform a comparative study of state-of-the-art approaches to paraphrase detection. An important insight from this study is that existing paraphrase approaches perform well when applied to clean texts, but do not necessarily deliver good performance on noisy texts, and vice versa. In contrast, our evaluation shows that the proposed DeepParaphrase-based approach achieves good results on both types of text, making it more robust and generic than the existing approaches.
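
The pipeline described in the abstract can be illustrated with a minimal, untrained sketch in plain Python. It is not the DeepParaphrase implementation: the layer sizes, the ReLU activation, the plain tanh RNN, the random weights and embeddings, the toy sentences, and every function name (`conv_ngrams`, `rnn_last_state`, `word_similarity_matrix`, `encode`) are assumptions made for illustration. It only shows the three ingredients named above: convolution filters scoring n-gram windows, a recurrent pass over those n-gram features, and a word-by-word similarity matrix between the two sentences.

```python
import math
import random

EMB, FILTERS, HIDDEN, N = 4, 3, 3, 2  # toy sizes, chosen arbitrarily


def rand_matrix(rows, cols, rng):
    """Random weights stand in for trained parameters (assumption)."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def matvec(m, v):
    """Multiply matrix m (list of rows) by vector v."""
    return [sum(w * x for w, x in zip(row, v)) for row in m]


def conv_ngrams(embs, filters, n):
    """Coarse-grained step 1: each filter scores every window of n words."""
    feats = []
    for i in range(len(embs) - n + 1):
        window = [x for vec in embs[i:i + n] for x in vec]  # concatenated n-gram
        feats.append([max(0.0, s) for s in matvec(filters, window)])  # ReLU
    return feats


def rnn_last_state(seq, w_in, w_rec):
    """Coarse-grained step 2: a plain tanh RNN reads the n-gram features in order."""
    h = [0.0] * len(w_rec)
    for x in seq:
        a, b = matvec(w_in, x), matvec(w_rec, h)
        h = [math.tanh(p + q) for p, q in zip(a, b)]
    return h


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    nu = math.sqrt(sum(x * x for x in u))
    nv = math.sqrt(sum(x * x for x in v))
    if nu == 0.0 or nv == 0.0:
        return 0.0
    return sum(x * y for x, y in zip(u, v)) / (nu * nv)


def word_similarity_matrix(embs_a, embs_b):
    """Fine-grained step: similarity of every word in A to every word in B."""
    return [[cosine(a, b) for b in embs_b] for a in embs_a]


def encode(tokens, vocab, filters, w_in, w_rec, n):
    """Sentence vector = final RNN state over the CNN n-gram features."""
    embs = [vocab[t] for t in tokens]
    return rnn_last_state(conv_ngrams(embs, filters, n), w_in, w_rec)


rng = random.Random(0)
sent_a = "the game starts at nine tonight".split()  # hypothetical short texts
sent_b = "the match kicks off at nine".split()

# Random toy embeddings; a real system would use pretrained word vectors.
vocab = {tok: [rng.uniform(-1.0, 1.0) for _ in range(EMB)]
         for tok in sorted(set(sent_a + sent_b))}
filters = rand_matrix(FILTERS, N * EMB, rng)
w_in = rand_matrix(HIDDEN, FILTERS, rng)
w_rec = rand_matrix(HIDDEN, HIDDEN, rng)

vec_a = encode(sent_a, vocab, filters, w_in, w_rec, N)
vec_b = encode(sent_b, vocab, filters, w_in, w_rec, N)
score = cosine(vec_a, vec_b)  # sentence-level similarity signal
sim = word_similarity_matrix([vocab[t] for t in sent_a],
                             [vocab[t] for t in sent_b])  # word-level signal
print("sentence-level cosine:", round(score, 3))
```

Because the weights are random, the printed score carries no meaning; in a trained model the sentence-level vectors and the word-level matrix would presumably feed a classifier that decides whether the pair is a paraphrase. How the two signals are actually combined is not specified in the abstract and is left out here.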

Journal: Information Processing & Management
Publication Date: Jun 30, 2018
Citations: 105
License type: cc-by-nc-nd
