Natural language generation from Universal Dependencies using data augmentation and pre-trained language models

Dang Tuan Nguyen, Trung Tran

doi:10.1504/ijiids.2023.128292

Abstract

In recent years, natural language generation (NLG) has focused on data-to-text tasks with different structured inputs. The generated text should contain the given information, be grammatically correct, and meet other criteria. In this research, we propose an approach that combines strong pre-trained language models with input data augmentation. The data studied in this work are Universal Dependencies (UDs), a framework developed for consistent annotation of grammar (parts of speech, morphological features and syntactic dependencies) for cross-lingual learning. We study English UD structures, which are modified in two ways to form two groups. In the first group, the modification removes the order information of each word and lemmatises the tokens. In the second group, the modification removes function words and surface-oriented morphological details. For both groups of modified structures, we apply the same approach to explore how the pre-trained sequence-to-sequence models text-to-text transfer transformer (T5) and BART perform on the training data. We augment the training data by creating several permutations of each input structure. The results show that our approach can generate good-quality English text and point to strategies for representing UD inputs as a promising direction of study.
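
The permutation-based augmentation mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the toy tree, the `lemma|UPOS|deprel|head_lemma` serialisation, and the helper names `linearize` and `augment` are all assumptions made for illustration. The sketch shows how one order-free, lemmatised UD structure could yield several training pairs that share the same target sentence.

```python
import random

# Hypothetical toy UD tree for "The dogs barked.":
# (id, lemma, upos, head, deprel). Not taken from the paper.
TOKENS = [
    (1, "the", "DET", 2, "det"),
    (2, "dog", "NOUN", 3, "nsubj"),
    (3, "bark", "VERB", 0, "root"),
]


def linearize(tokens):
    """Serialise tokens as 'lemma|UPOS|deprel|head_lemma'.

    Heads are referenced by lemma rather than by position so that the
    string carries no word-order information (an assumed encoding; it
    is ambiguous when a sentence repeats a lemma).
    """
    lemma_by_id = {tid: lemma for tid, lemma, _, _, _ in tokens}
    parts = []
    for _, lemma, upos, head, deprel in tokens:
        head_lemma = lemma_by_id.get(head, "ROOT")
        parts.append(f"{lemma}|{upos}|{deprel}|{head_lemma}")
    return " ".join(parts)


def augment(tokens, target, n_permutations, seed=0):
    """Return (source, target) pairs, one per shuffled token order.

    Shuffles are drawn independently, so the same order may repeat.
    """
    rng = random.Random(seed)
    pairs = []
    for _ in range(n_permutations):
        shuffled = list(tokens)
        rng.shuffle(shuffled)
        pairs.append((linearize(shuffled), target))
    return pairs


pairs = augment(TOKENS, "The dogs barked.", n_permutations=4)
for source, target in pairs:
    print(source, "->", target)
```

Each resulting pair could then be fed to a sequence-to-sequence model such as T5 or BART as a source string and reference sentence; how the paper actually formats its inputs is not stated in the abstract.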

Similar Papers

Natural language generation from Universal Dependencies using data augmentation and pre-trained language models
Trung Tran ... Dang Tuan Nguyen
International Journal of Intelligent Information and Database Systems | VOL. 16
01 Jan 2023

Pre-Trained Language Models for Text Generation: A Survey
Junyi Li ... Jian-Yun Nie
ACM Computing Surveys | VOL. 56
25 Apr 2024

Research on the Application of Prompt Learning Pretrained Language Model in Machine Translation Task with Reinforcement Learning
Canjun Wang ... Zhengyu Ju
Electronics | VOL. 12
09 Aug 2023

Understanding latent affective bias in large pre-trained neural language models
Anoop Kadan ... Lajish V.L
Natural Language Processing Journal | VOL. 7
05 Mar 2024
