Abstract

In natural language processing (NLP), paraphrase identification (PI) determines whether a pair of sentences conveys the same meaning even when the sentences share little or no lexical overlap. The major challenge in solving this problem is the large number of linguistic variations that can express the same meaning. This paper provides a detailed survey of traditional similarity measures, statistical machine translation metrics, machine learning techniques, and deep learning techniques, along with a well-defined progression between them. It also covers various word embedding methods and gives a step-wise derivation of their learning modules. The survey further traces the evolution of deep learning approaches in a clear sequence. Finally, a comparative analysis of techniques for PI is presented, which suggests research directions for future work in this domain.
