Abstract

Word suggestion in unsupervised sentence simplification aims to replace complex words in a given sentence with simpler alternatives. Existing approaches mostly do this without considering the word's context within the input sentence. In this paper, we propose a technique that brings context awareness to word suggestion by combining pre-trained BERT models with a successful edit-based unsupervised sentence simplification model. More importantly, we show that simply fine-tuning the BERT model on simple English corpora improves simplification results and can even outperform some competing supervised methods. Finally, we introduce a framework that filters an arbitrary amount of unlabeled in-domain text for tuning the model in situations where labeled data (i.e., aligned complex and simple sentences) is scarce. This preprocessing step also speeds up training by avoiding fine-tuning on unnecessary samples.
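
To illustrate the core idea of context-aware word suggestion with a masked language model, the following minimal Python sketch masks a complex word and lets a pre-trained BERT model propose replacements conditioned on the surrounding sentence. This is not the paper's exact pipeline; the model name, example sentence, and the frequency-based filtering remark are illustrative assumptions.

    from transformers import pipeline

    # Load a pre-trained masked language model (placeholder checkpoint;
    # the paper fine-tunes BERT on simple English corpora before this step).
    fill_mask = pipeline("fill-mask", model="bert-base-uncased")

    sentence = "The committee will convene next week to discuss the proposal."
    complex_word = "convene"

    # Mask the complex word so BERT predicts replacements based on the
    # surrounding context rather than on the word in isolation.
    masked = sentence.replace(complex_word, fill_mask.tokenizer.mask_token, 1)

    # top_k candidates ranked by masked-LM probability; a simplification
    # system would further filter these by simplicity, e.g. word frequency.
    for candidate in fill_mask(masked, top_k=10):
        print(f"{candidate['token_str']:>12}  {candidate['score']:.3f}")

Because the prediction is conditioned on the full sentence, the suggested substitutes (e.g., "meet") tend to preserve the original meaning better than context-free synonym lookup.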
