Abstract

Learning to paraphrase supports both writing ability and reading comprehension, particularly for less skilled learners. As such, educational tools that integrate automated evaluation of paraphrases can provide timely feedback and help learners develop paraphrasing skills more efficiently and effectively. Paraphrase identification is a popular NLP classification task that involves establishing whether two sentences share a similar meaning. Paraphrase quality assessment is a slightly more complex task, in which pairs of sentences are evaluated in depth across multiple dimensions. In this study, we focus on four dimensions: lexical, syntactic, semantic, and overall quality. Our study introduces and evaluates several machine learning models for estimating paraphrase quality across these four dimensions: Extra Trees models built on handcrafted features, Siamese neural networks based on BiLSTM RNNs, and pretrained BERT-based models, combined with transfer learning from a larger, generic paraphrase corpus. Two datasets are considered for the paraphrase quality tasks: the ULPC (User Language Paraphrase Corpus) containing 1998 paraphrases and a smaller dataset with 115 paraphrases based on children’s inputs. The paraphrase identification dataset used for the transfer learning task is the MSRP dataset (Microsoft Research Paraphrase Corpus) containing 5801 paraphrases. On the ULPC dataset, our BERT model improves upon the previous baseline by at least 0.1 F1-score across the four dimensions. When fine-tuning on the children dataset starting from ULPC-trained weights, both the BERT and Siamese neural network models improve upon their original scores by at least 0.11 F1-score. These results suggest that transfer learning using generic paraphrase identification datasets can be successful, while at the same time obtaining comparable results in fewer epochs.
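
To make the BERT-based approach concrete, the sketch below scores a sentence pair with a pretrained encoder and a classification head. It is a minimal illustration, not the authors' released code: the `bert-base-uncased` checkpoint, the number of quality classes, and the `score_paraphrase` helper are assumptions chosen for the example.

```python
# Minimal sketch: scoring a paraphrase pair with a BERT-based classifier.
# Checkpoint, label granularity, and the helper name are illustrative assumptions.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

MODEL_NAME = "bert-base-uncased"   # assumption: any BERT checkpoint fits here
NUM_CLASSES = 4                    # assumption: quality bins for one dimension

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSequenceClassification.from_pretrained(
    MODEL_NAME, num_labels=NUM_CLASSES
)

def score_paraphrase(source: str, paraphrase: str) -> int:
    """Encode the sentence pair jointly and return the predicted quality class."""
    inputs = tokenizer(source, paraphrase, return_tensors="pt",
                       truncation=True, padding=True)
    with torch.no_grad():
        logits = model(**inputs).logits
    return int(logits.argmax(dim=-1))

print(score_paraphrase(
    "The cat sat on the mat.",
    "A cat was sitting on the rug."
))
```

Whether the four dimensions (lexical, syntactic, semantic, overall) share one model or use separate fine-tuned heads is a design choice the sketch leaves open.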

Highlights

  • Paraphrases range widely in terms of definitions, from concise text constructs that are “similar enough in meaning” [1] to more philosophical implications, as paraphrases provide “differing textual realizations of the same meaning” [2]

  • We analyzed the performance of models trained on the User Language Paraphrase Corpus (ULPC) dataset and tested on the children dataset in order to observe their capability to generalize out-of-the-box

  • The Siamese network (SN) and Bidirectional Encoder Representations from Transformers (BERT)-based models pretrained on the ULPC dataset were fine-tuned on the children dataset (a minimal sketch of this fine-tuning step follows this list)
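
The sketch below illustrates the transfer-learning step named in the last highlight for the BERT model (the Siamese BiLSTM would follow the same pattern with its own encoder): a model that has already been trained on ULPC is further fine-tuned on children-dataset pairs. The checkpoint name, the two in-memory examples, and the hyperparameters are illustrative assumptions, not values from the paper.

```python
# Minimal sketch of the transfer-learning step: continue fine-tuning on the
# small children dataset starting from ULPC-trained weights. All names and
# values below are illustrative assumptions.
import torch
from torch.optim import AdamW
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# In the study's setting this would be the checkpoint saved after ULPC training.
CHECKPOINT = "bert-base-uncased"
tokenizer = AutoTokenizer.from_pretrained(CHECKPOINT)
model = AutoModelForSequenceClassification.from_pretrained(CHECKPOINT, num_labels=4)

# Hypothetical children-dataset examples: (source, paraphrase, quality label 0-3).
children_pairs = [
    ("Plants need sunlight to grow.", "Plants grow when they get sun.", 3),
    ("Water freezes at zero degrees.", "Water is cold.", 1),
]

optimizer = AdamW(model.parameters(), lr=2e-5)  # small LR to preserve prior weights
model.train()
for epoch in range(3):  # few epochs suffice once pretrained weights are reused
    for source, paraphrase, label in children_pairs:
        batch = tokenizer(source, paraphrase, return_tensors="pt", truncation=True)
        loss = model(**batch, labels=torch.tensor([label])).loss
        loss.backward()
        optimizer.step()
        optimizer.zero_grad()
```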



Introduction

Paraphrases range widely in terms of definitions, from concise text constructs that are “similar enough in meaning” [1] to more philosophical implications, as paraphrases provide “differing textual realizations of the same meaning” [2]. A paraphrase is a restatement of a text generated with different words, normally with the aim of providing clarity. The ability to paraphrase is vital, especially for young learners. Encouraging readers to transform a source text into more familiar words and phrases helps them better understand it by activating relevant prior knowledge as they develop a textbase model of what the text explicitly conveys [3]. Learning to paraphrase facilitates both reading comprehension and writing ability, particularly for less skilled readers and writers [4,5,6]. An inability to generate a paraphrase is a clear indicator that the reader is struggling with comprehension [7]. Learning how to paraphrase effectively provides a crucial foundation for students to master other skills that enhance reading comprehension, such as bridging and elaboration [4].

