Abstract

Paraphrases are texts that convey the same meaning through different expressions. Pivot-based methods, also known as round-trip translation, have shown promising results in generating high-quality paraphrases. However, existing pivot-based methods all rely on language as the pivot, which requires large-scale, high-quality parallel bilingual texts. In this paper, we explore the feasibility of using semantic and syntactic representations as the pivot for paraphrase generation. Concretely, we transform a sentence into a variety of semantic or syntactic representations (including AMR, UD, and latent semantic representation) and then decode the sentence back from these representations. We further explore a pretraining-based approach that compresses the pipeline into an end-to-end framework. We conduct experiments comparing the approaches across different kinds of pivots. Experimental results show that taking AMR as the pivot yields paraphrases of better quality than taking language as the pivot, and that the end-to-end framework reduces semantic shift when language is used as the pivot. Moreover, several unsupervised pivot-based methods generate paraphrases of similar quality to a supervised sequence-to-sequence model, which suggests that parallel paraphrase data may not be necessary for paraphrase generation.

Highlights

  • P(Y|X) = P(Z|X)P(Y|Z), where Z denotes the pivot representation of X (see the sketch after this list)

  • Existing pivot-based methods all rely on language as the pivot, requiring large-scale, high-quality parallel bilingual texts

  • Paraphrase generation is an important and challenging task in the field of Natural Language Processing (NLP), with applications such as information retrieval (Yan et al., 2016), question answering (Fader et al., 2014; Yin et al., 2015), and machine translation (Cho et al., 2014)

  • We explore the feasibility of using different pivots for pivot-based paraphrasing models, including syntactic representation (Universal Dependencies, UD; McDonald et al., 2013), semantic representation (Abstract Meaning Representation, AMR; Banarescu et al., 2013), and latent semantic representation (LSR)
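
To make the factorization P(Y|X) = P(Z|X)P(Y|Z) concrete, here is a minimal sketch of the two-stage pivot pipeline in Python. `TextToPivot`, `PivotToText`, and `paraphrase` are hypothetical names introduced for illustration, not part of the paper's code; any trained text-to-pivot and pivot-to-text model pair (e.g., text-to-AMR and AMR-to-text) could fill these roles.

```python
# Minimal sketch of the pivot factorization P(Y|X) = P(Z|X)P(Y|Z).
# `TextToPivot` and `PivotToText` are hypothetical interfaces standing in
# for any trained encoder/decoder pair over a pivot representation
# (AMR, UD, LSR, or a foreign language).

from typing import Protocol


class TextToPivot(Protocol):
    def encode(self, sentence: str) -> str:
        """Map a sentence X to a pivot representation Z, i.e., model P(Z|X)."""
        ...


class PivotToText(Protocol):
    def decode(self, pivot: str) -> str:
        """Map a pivot representation Z back to a sentence Y, i.e., model P(Y|Z)."""
        ...


def paraphrase(sentence: str, to_pivot: TextToPivot, from_pivot: PivotToText) -> str:
    """Round-trip a sentence through the pivot to produce a paraphrase."""
    pivot = to_pivot.encode(sentence)   # first stage: P(Z|X)
    return from_pivot.decode(pivot)     # second stage: P(Y|Z)
```

Because the pivot abstracts away from surface form, decoding from it can yield a sentence that differs in wording from the input while preserving its meaning.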


Summary

Introduction

When Z is chosen to be a representation in a different language, the quality of the generated paraphrases largely depends on the pre-existing machine translation system. Choosing language as the pivot has some disadvantages: (1) the pipelined translations may incur semantic shift (Guo et al., 2019), and (2) machine translation systems are sensitive to domain, so the quality of translating out-of-domain sentences cannot be guaranteed. We explore the feasibility of using different pivots for pivot-based paraphrasing models, including syntactic representation (Universal Dependencies, UD; McDonald et al., 2013), semantic representation (Abstract Meaning Representation, AMR; Banarescu et al., 2013), and latent semantic representation (LSR). Abstract Meaning Representation (AMR) (Banarescu et al., 2013) is a rooted, labeled, acyclic graph that abstracts away from syntax and preserves semantics. Since AMR keeps only semantic information, paraphrases can share the same AMR graph.
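
As an illustration of the AMR round trip described above, the following sketch uses the open-source amrlib library to parse a sentence into an AMR graph and generate a sentence back from it. This is not the system used in the paper, only one possible realization under the assumption that amrlib and its pretrained models are installed; the exact API may differ across versions.

```python
# Hedged sketch: AMR round-trip paraphrasing with amrlib
# (https://github.com/bjascob/amrlib). Assumes the pretrained
# sentence-to-graph and graph-to-sentence models are installed.

import amrlib

stog = amrlib.load_stog_model()   # P(Z|X): sentence -> AMR graph
gtos = amrlib.load_gtos_model()   # P(Y|Z): AMR graph -> sentence

graphs = stog.parse_sents(["The boy wants the girl to believe him."])
paraphrases, _ = gtos.generate(graphs)
print(paraphrases[0])
```

Because the AMR graph discards surface word order and function words, the regenerated sentence often differs in phrasing from the input while expressing the same meaning.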

