Abstract

Representation transfer is a widely used technique in natural language processing. We propose methods for cleaning WikiLarge, the dominant text simplification (TS) dataset, from multiple views to remove errors that hurt model training and fine-tuning; our results show that these methods effectively refine the dataset. We further propose a continued-fine-tuning strategy that transfers pre-trained text representations from a related task (e.g., text summarization) to text simplification, improving the performance of pre-trained models on TS while speeding up training and easing convergence. In addition, we propose a new decoding strategy for simple text generation that produces simpler, more comprehensible text with controllable lexical simplicity. Experimental results show that our approach performs well on many evaluation metrics.
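
Below is a minimal sketch of the continued-fine-tuning idea: start from a checkpoint already fine-tuned on a related task (summarization) and continue fine-tuning it on simplification pairs. The checkpoint name (`facebook/bart-large-cnn`), the data file `wikilarge_cleaned.jsonl` with `complex`/`simple` columns, and the hyperparameters are illustrative assumptions, not the paper's actual setup.

```python
# Sketch: continued fine-tuning from a summarization checkpoint for TS.
# Assumes Hugging Face transformers + datasets; all names and values are illustrative.
from transformers import (
    AutoTokenizer,
    AutoModelForSeq2SeqLM,
    DataCollatorForSeq2Seq,
    Seq2SeqTrainer,
    Seq2SeqTrainingArguments,
)
from datasets import load_dataset

# Start from representations learned on a related task (summarization),
# then continue fine-tuning on text simplification pairs.
checkpoint = "facebook/bart-large-cnn"  # assumption: any summarization checkpoint
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSeq2SeqLM.from_pretrained(checkpoint)

# Assumed format: cleaned WikiLarge pairs with "complex" and "simple" text fields.
dataset = load_dataset("json", data_files={"train": "wikilarge_cleaned.jsonl"})

def preprocess(batch):
    # Tokenize complex sentences as inputs and simple sentences as targets.
    inputs = tokenizer(batch["complex"], truncation=True, max_length=256)
    labels = tokenizer(text_target=batch["simple"], truncation=True, max_length=256)
    inputs["labels"] = labels["input_ids"]
    return inputs

tokenized = dataset["train"].map(
    preprocess, batched=True, remove_columns=dataset["train"].column_names
)

args = Seq2SeqTrainingArguments(
    output_dir="bart-cnn-to-simplification",
    learning_rate=3e-5,                # illustrative value
    num_train_epochs=3,
    per_device_train_batch_size=8,
    predict_with_generate=True,
)

trainer = Seq2SeqTrainer(
    model=model,
    args=args,
    train_dataset=tokenized,
    tokenizer=tokenizer,
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
)
trainer.train()
```

Because the encoder-decoder already maps long inputs to shorter, information-preserving outputs, continuing from such a checkpoint typically needs fewer simplification updates than fine-tuning from a generically pre-trained model, which is consistent with the faster convergence claimed above.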
