Abstract

Text generation often requires high-precision output that obeys task-specific rules. This fine-grained control is difficult to enforce with off-the-shelf deep learning models. In this work, we consider augmenting neural generation models with discrete control states learned through a structured latent-variable approach. Under this formulation, task-specific knowledge can be encoded through a range of rich posterior constraints that are effectively trained into the model. This approach allows users to ground internal model decisions in prior knowledge, without sacrificing the representational power of neural generative models. Experiments apply this approach to text generation tasks. We find that the method improves over standard baselines, while also providing fine-grained control.
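
To make the setup concrete, a generic posterior-regularized variational objective of this kind (notation ours; not necessarily the paper's exact formulation) can be written as

\[
\max_{\theta, \lambda}\; \mathbb{E}_{q_\lambda(z \mid x, y)}\big[\log p_\theta(y, z \mid x)\big] \;+\; \mathrm{H}\big(q_\lambda\big) \;-\; \alpha \,\big\| \max\big(0,\; \mathbb{E}_{q_\lambda}[\phi(y, z)] - b \big) \big\|^2 ,
\]

where z is the sequence of discrete control states, q_\lambda is a structured approximate posterior, \phi are feature functions encoding task-specific rules with bounds b, and \alpha controls how strongly posterior expectations are pushed to satisfy the constraints.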

Highlights

  • A core challenge in using deep learning for NLP is developing methods that allow for controlled output while maintaining the broad coverage of data-driven methods

  • The methodology utilizes posterior regularization within a structured variational framework. We show that this approach can induce a fully autoregressive neural model that is as expressive as standard neural decoders while utilizing meaningful discrete control states (see the training sketch after this list)

  • We show this decoder is effective for text generation while inducing meaningful discrete representations
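
As a rough illustration of the methodology highlight above, the sketch below trains per-token discrete control states with a posterior constraint penalty. It is a minimal stand-in, not the paper's implementation: we assume a mean-field categorical posterior with a Gumbel-softmax relaxation in place of the paper's structured inference, and the model sizes, module names, and the "use every state" constraint are all hypothetical.

```python
# Minimal sketch of posterior-regularized discrete control states for generation.
# Assumptions (ours, not the paper's): a mean-field categorical posterior with a
# Gumbel-softmax relaxation stands in for structured (CRF-style) inference, and
# the sizes, module names, and toy constraint below are all hypothetical.
import math

import torch
import torch.nn as nn
import torch.nn.functional as F

V, K, D, T, B = 100, 5, 64, 12, 8  # vocab size, control states, hidden dim, length, batch

class ControlledDecoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.tok_emb = nn.Embedding(V, D)
        self.state_emb = nn.Embedding(K, D)
        self.enc = nn.GRU(D, D, batch_first=True)      # encodes y for the inference net
        self.q_head = nn.Linear(D, K)                  # q(z_t | y): per-token state logits
        self.dec = nn.GRU(2 * D, D, batch_first=True)  # p(y_t | y_<t, z_t)
        self.out = nn.Linear(D, V)

    def forward(self, y):
        # Inference network: per-token approximate posterior over control states.
        h, _ = self.enc(self.tok_emb(y))
        q_logits = self.q_head(h)                      # (B, T, K)
        q_probs = q_logits.softmax(-1)
        # Relaxed sample of control states for reparameterized gradients.
        z_soft = F.gumbel_softmax(q_logits, tau=0.5)   # (B, T, K)
        z_vec = z_soft @ self.state_emb.weight         # soft control-state embeddings
        # Fully autoregressive decoder, conditioned on the control state at each step.
        d, _ = self.dec(torch.cat([self.tok_emb(y), z_vec], dim=-1))
        logits = self.out(d)
        recon = F.cross_entropy(logits[:, :-1].reshape(-1, V), y[:, 1:].reshape(-1))
        # KL(q || uniform) discourages the posterior from collapsing onto one state.
        kl = (q_probs * (q_probs.clamp_min(1e-8).log() + math.log(K))).sum(-1).mean()
        # Posterior constraint (toy stand-in for task-specific rules): each control
        # state should be used at least once per sequence, in expectation.
        expected_counts = q_probs.sum(1)               # (B, K)
        constraint = F.relu(1.0 - expected_counts).pow(2).sum(-1).mean()
        return recon + 0.1 * kl + 1.0 * constraint

model = ControlledDecoder()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
y = torch.randint(0, V, (B, T))                        # toy token batch
opt.zero_grad()
loss = model(y)
loss.backward()
opt.step()
print(f"loss: {loss.item():.3f}")
```

The constraint term is the interesting part: it penalizes posterior expectations directly, so task knowledge shapes which control states the model learns without being hard-coded into the decoder itself.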


Introduction

A core challenge in using deep learning for NLP is developing methods that allow for controlled output while maintaining the broad coverage of data-driven methods. While this issue is less problematic in classification tasks, it has hampered the deployment of systems for conditional natural language generation (NLG), where users often need to control output through task-specific knowledge or plans. Consider the case of encoder-decoder models for generation, built with RNNs or transformers. These models generate fluent output and provide flexible representations of their conditioning; however, their autoregressive decoders are globally dependent, which makes it challenging to incorporate domain constraints (one way discrete control states can restore such step-by-step control is sketched below).
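
To illustrate the kind of fine-grained control at stake, the sketch below feeds a user-specified plan of control states into the decoder one step at a time. It continues the toy model from the earlier sketch; greedy decoding and the BOS id are our assumptions, not details from the paper.

```python
# Hedged sketch of controlled decoding: a user-specified plan of discrete control
# states steers each step of an otherwise standard autoregressive decoder.
# Continues the toy ControlledDecoder above; greedy decoding and the BOS id
# are our assumptions, not details from the paper.
import torch

@torch.no_grad()
def generate(model, plan, bos_id=1):
    """plan: (T,) long tensor of control-state ids acting as a task-specific plan."""
    y = torch.full((1, 1), bos_id, dtype=torch.long)
    h = None
    for t in range(plan.size(0)):
        tok = model.tok_emb(y[:, -1:])                            # embed the last token
        state = model.state_emb(plan[t : t + 1]).view(1, 1, -1)   # hard state for step t
        d, h = model.dec(torch.cat([tok, state], dim=-1), h)
        next_tok = model.out(d[:, -1]).argmax(-1, keepdim=True)
        y = torch.cat([y, next_tok], dim=1)
    return y[0, 1:]

plan = torch.tensor([0, 0, 2, 2, 1])  # e.g., field ids in a data-to-text plan
print(generate(model, plan))          # `model` from the training sketch above
```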
