Abstract

Controllable Text Generation (CTG) aims to steer the output of a Language Model (LM) to meet specific constraints. For example, in a customer service conversation, the agent's responses should ideally be soothing and address the user's dissatisfaction or complaints, which places significant demands on controlling the model's output. However, existing methods each have drawbacks. Prompting and fine-tuning language models suffer from the "hallucination" phenomenon and cannot guarantee complete adherence to constraints. Conditional language models (CLMs), which map control codes into the LM's representations or latent space, require training the modified model from scratch and demand large amounts of customized data. Decoding-time methods either apply Bayes' rule to modify the LM's output distribution or model the constraints as a combination of energy functions and update the output along the low-energy direction; both approaches suffer from poor sampling efficiency. Moreover, no existing method considers the relation between constraint weights and the context, which is essential in practical applications such as customer service. To alleviate these problems, we propose Controllable Text Generation with Generative Adversarial Networks (CTGGAN), which uses a language model with a logits bias as the Generator to produce constrained text and a Discriminator with learnable constraint-weight combinations to score and update the generation. We evaluate the method on a text completion task and on Chinese customer service dialogues, where it shows superior performance on metrics such as PPL and Dist-3. In addition, CTGGAN decodes more efficiently than comparable methods.
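The abstract does not specify CTGGAN's implementation, but the two components it names (a Generator that adds a learned bias to the LM's logits, and a Discriminator that mixes several constraint scores with learnable weights) can be sketched as follows. This is a minimal, hypothetical illustration: the class names, the Hugging Face-style `.logits` interface, and the softmax weighting are all assumptions, not the paper's actual design.

```python
# Illustrative sketch only: every module and name below is an assumption,
# since the abstract does not describe CTGGAN's actual architecture.
import torch
import torch.nn as nn

class BiasedGenerator(nn.Module):
    """A pretrained LM whose next-token logits are shifted by a learnable bias."""
    def __init__(self, lm, vocab_size):
        super().__init__()
        self.lm = lm  # frozen pretrained LM; assumed to return HF-style outputs
        self.logits_bias = nn.Parameter(torch.zeros(vocab_size))

    def forward(self, input_ids):
        base_logits = self.lm(input_ids).logits[:, -1, :]  # next-token logits
        return base_logits + self.logits_bias              # steer via the bias

class WeightedDiscriminator(nn.Module):
    """Scores text against several constraints with learnable mixing weights."""
    def __init__(self, constraint_scorers):
        super().__init__()
        self.scorers = nn.ModuleList(constraint_scorers)   # one scorer per constraint
        self.weights = nn.Parameter(torch.ones(len(constraint_scorers)))

    def forward(self, text_embedding):
        # Each scorer maps an embedding to a scalar score of shape (batch,).
        scores = torch.stack([s(text_embedding) for s in self.scorers], dim=-1)
        # Context-dependent mixing: normalize the learnable weights and combine.
        return (torch.softmax(self.weights, dim=0) * scores).sum(dim=-1)
```

In an adversarial setup such as the one the abstract describes, the Generator's bias would be updated against the Discriminator's combined score; the training loop is omitted here.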
