Abstract

The text style transfer task requires a model to rewrite a sentence from one style into another while preserving its content, a challenging problem that has long suffered from a shortage of parallel data. In this paper, we first propose a semi-supervised text style transfer model that combines small-scale parallel data with large-scale nonparallel data. With these two types of training data, we introduce a projection function between the latent spaces of different styles and design two constraints to train it. We also introduce two other simple but effective semi-supervised methods for comparison. To evaluate the proposed methods, we build and release a novel style transfer dataset that transfers sentences between the style of ancient Chinese poetry and modern Chinese.
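
To make the projection idea concrete, below is a minimal PyTorch sketch of one way such a model could be structured. It is an assumption-laden illustration, not the paper's implementation: the shared GRU encoder-decoder, the linear projection layer, and the two losses shown (latent alignment on parallel pairs, autoencoding on non-parallel sentences) are hypothetical stand-ins for the components the abstract names.

```python
import torch.nn as nn
import torch.nn.functional as F

class ProjectionStyleTransfer(nn.Module):
    """Illustrative model: a shared GRU encoder-decoder plus a linear
    projection bridging the latent spaces of the two styles."""

    def __init__(self, vocab_size, emb_dim=256, hid_dim=512):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.encoder = nn.GRU(emb_dim, hid_dim, batch_first=True)
        self.decoder = nn.GRU(emb_dim, hid_dim, batch_first=True)
        self.out = nn.Linear(hid_dim, vocab_size)
        # Projection function between the two style latent spaces.
        self.project = nn.Linear(hid_dim, hid_dim)

    def encode(self, tokens):
        _, h = self.encoder(self.embed(tokens))  # h: (1, batch, hid_dim)
        return h

    def decode(self, h, tgt_in):
        out, _ = self.decoder(self.embed(tgt_in), h)
        return self.out(out)  # logits: (batch, seq_len, vocab_size)

def parallel_losses(model, src, tgt_in, tgt_out):
    # Two assumed constraints on the small parallel set:
    # (1) pull the projected source latent toward the encoded target latent;
    # (2) decode the target sentence from the projected latent.
    h_proj = model.project(model.encode(src))
    align = F.mse_loss(h_proj, model.encode(tgt_out).detach())
    logits = model.decode(h_proj, tgt_in)
    xent = F.cross_entropy(logits.reshape(-1, logits.size(-1)),
                           tgt_out.reshape(-1))
    return align + xent

def nonparallel_loss(model, sent_in, sent_out):
    # Assumed unsupervised signal on the large non-parallel set:
    # plain autoencoding, reconstructing a sentence from its own latent.
    logits = model.decode(model.encode(sent_out), sent_in)
    return F.cross_entropy(logits.reshape(-1, logits.size(-1)),
                           sent_out.reshape(-1))
```

In this reading, the small parallel set supervises the projection directly, while the large non-parallel set keeps the encoder and decoder well trained in both styles.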

Highlights

  • Natural language generation (NLG) tasks have been attracting growing attention from researchers, including response generation (Vinyals and Le, 2015), machine translation (Bahdanau et al., 2014), automatic summarization (Chopra et al., 2016), and question generation (Gao et al., 2019).

  • Building such a style transfer model has long suffered from a shortage of parallel training data, since constructing a parallel corpus that aligns the content of different styles is costly and laborious, which makes supervised training difficult.

  • Considering these issues, in this paper, instead of disentangling the input, we propose a differentiable encoder-decoder-based model that contains a projection layer to build a bridge between the latent spaces of different styles.

Summary

Introduction

Natural language generation (NLG) tasks have been attracting growing attention from researchers, including response generation (Vinyals and Le, 2015), machine translation (Bahdanau et al., 2014), automatic summarization (Chopra et al., 2016), and question generation (Gao et al., 2019). As a fundamental attribute of text, style can have a broad and ambiguous scope, such as ancient poetry style vs. modern language style, or positive vs. negative sentiment. Building such a style transfer model has long suffered from a shortage of parallel training data, since constructing a parallel corpus that aligns the content of different styles is costly and laborious, which makes supervised training difficult. One commonly used method is disentangling the style and content of the source sentence (John et al., 2018; Shen et al., 2017; Hu et al., 2017): for the input, these methods learn representations of the style and of style-independent content, expecting the latter to keep only the content information. However, Lample et al. (2019) showed that disentanglement is not easy and that existing methods are not adequate to learn style-independent representations.
