Neural Data-to-Text Generation Based on Small Datasets: Comparing the Added Value of Two Semi-Supervised Learning Approaches on Top of a Large Language Model

Chris Van Der Lee, Travis J Wiltshire, Chris Emmery, Emiel Krahmer, Thiago Castro Ferreira

doi:10.1162/coli_a_00484


Open Access

https://doi.org/10.1162/coli_a_00484


Abstract

This study discusses the effect of semi-supervised learning in combination with pretrained language models for data-to-text generation. It is not known whether semi-supervised learning is still helpful when a large-scale language model is also supplemented. This study aims to answer this question by comparing a data-to-text system only supplemented with a language model, to two data-to-text systems that are additionally enriched by a data augmentation or a pseudo-labeling semi-supervised learning approach. Results show that semi-supervised learning results in higher scores on diversity metrics. In terms of output quality, extending the training set of a data-to-text system with a language model using the pseudo-labeling approach did increase text quality scores, but the data augmentation approach yielded similar scores to the system without training set extension. These results indicate that semi-supervised learning approaches can bolster output quality and diversity, even when a language model is also present.
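To make the two training-set extension strategies named in the abstract more concrete, the following is a minimal sketch only, not the authors' implementation. It assumes that data augmentation means copying an existing data-text pair with one slot value swapped, and that pseudo-labeling means pairing unlabeled texts with data predicted by some labeling model. The names `augment`, `pseudo_label`, and `toy_labeler`, and the example records, are hypothetical and chosen for illustration.

```python
from typing import Callable, Dict, List, Tuple

Record = Dict[str, str]   # structured input, e.g. {"team": "Ajax"} (hypothetical schema)
Pair = Tuple[Record, str]  # (data, reference text)


def augment(pairs: List[Pair], slot: str, alternatives: List[str]) -> List[Pair]:
    """Data augmentation (assumed form): swap one slot value in both data and text."""
    new_pairs: List[Pair] = []
    for data, text in pairs:
        original = data.get(slot)
        # Only augment pairs whose slot value literally appears in the text.
        if original is None or original not in text:
            continue
        for alt in alternatives:
            if alt == original:
                continue
            new_data = {**data, slot: alt}
            new_pairs.append((new_data, text.replace(original, alt)))
    return new_pairs


def pseudo_label(texts: List[str], labeler: Callable[[str], Record]) -> List[Pair]:
    """Pseudo-labeling (assumed form): pair unlabeled texts with predicted data."""
    pairs: List[Pair] = []
    for text in texts:
        data = labeler(text)
        if data:  # skip texts for which the labeler predicts nothing
            pairs.append((data, text))
    return pairs


def toy_labeler(text: str) -> Record:
    """Stand-in for a trained labeling model: a simple keyword lookup."""
    for team in ("Ajax", "PSV"):
        if team in text:
            return {"team": team}
    return {}


# A tiny seed set, extended in the two ways described above.
seed: List[Pair] = [({"team": "Ajax"}, "Ajax won the match.")]
augmented = augment(seed, "team", ["Ajax", "PSV"])
pseudo = pseudo_label(["PSV lost at home.", "It rained."], toy_labeler)
extended = seed + augmented + pseudo
```

In this sketch the extended set would then be used to fine-tune the generation model in place of the seed set alone. A real labeler would be a trained model whose predictions can be wrong, which is one reason the two strategies may affect output quality differently.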


Journal: Computational Linguistics
Publication Date: Sep 1, 2023
License type: CC BY-NC-ND 4.0

Similar Papers

A real use case of semi-supervised learning for mammogram classification in a local clinic of Costa Rica.
Saul Calderon-Ramirez ... David Elizondo
Medical & Biological Engineering & Computing | VOL. 60
03 Mar 2022

Automated three-dimensional reconstruction and morphological analysis of dendritic spines based on semi-supervised learning.
Peng Shi ... Jinsheng Hong
Biomedical Optics Express | VOL. 5
17 Apr 2014

FSELM: fusion semi-supervised extreme learning machine for indoor localization with Wi-Fi and Bluetooth fingerprints
Xinlong Jiang ... Junfa Liu
Soft Computing | VOL. 22
06 Apr 2018

A two-phase hybrid of semi-supervised and active learning approach for sequence labeling
Hamed Hassanzadeh ... Mohammadreza Keyvanpour
Intelligent Data Analysis | VOL. 17
17 Apr 2013
