Abstract

In many machine learning scenarios, supervision by gold labels is not available and consequently neural models cannot be trained directly by maximum likelihood estimation. In a weak supervision scenario, metric-augmented objectives can be employed to assign feedback to model outputs, from which a supervision signal for training can be extracted. We present several objectives for two separate weakly supervised tasks, machine translation and semantic parsing. We show that objectives should actively discourage negative outputs in addition to promoting a surrogate gold structure. This notion of bipolarity is naturally present in ramp loss objectives, which we adapt to neural models. We show that bipolar ramp loss objectives outperform non-bipolar ramp loss objectives and minimum risk training on both weakly supervised tasks, as well as on a supervised machine translation task. Additionally, we introduce a novel token-level ramp loss objective, which is able to outperform even the best sequence-level ramp loss on both weakly supervised tasks.

Highlights

  • Sequence-to-sequence neural models are standardly trained using a maximum likelihood estimation (MLE) objective

  • In our first task of semantic parsing, question-answer pairs provide a weak supervision signal to find parses that execute to the correct answer

  • We show that ramp loss can outperform minimum risk training (MRT) if it incorporates bipolar supervision where parses that receive negative feedback are actively discouraged
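The bipolar intuition in the last highlight can be sketched in code. In a hope/fear-style ramp loss, a "hope" candidate (high model score and positive feedback) is promoted while a "fear" candidate (high model score but negative feedback) is actively discouraged. This is a minimal sketch only; the exact candidate selection and score/feedback interpolation used in the paper may differ, and `ramp_loss_bipolar` is a hypothetical helper name:

```python
def ramp_loss_bipolar(scores, feedback):
    """Bipolar ramp loss over a list of candidate outputs.

    scores:   model log-scores for each candidate
    feedback: external metric values (higher is better),
              e.g. 1 if a parse executes to the correct answer

    The hope candidate maximizes score + feedback; the fear
    candidate maximizes score - feedback. The loss rewards the
    hope output and penalizes the fear output, so minimizing it
    pushes probability mass from fear toward hope.
    """
    n = range(len(scores))
    hope = max(n, key=lambda i: scores[i] + feedback[i])
    fear = max(n, key=lambda i: scores[i] - feedback[i])
    return scores[fear] - scores[hope]
```

When hope and fear coincide the loss is zero, so only candidates whose model score and feedback disagree produce a gradient signal.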


Summary

Introduction

Sequence-to-sequence neural models are standardly trained using a maximum likelihood estimation (MLE) objective. MLE training requires full supervision by gold target structures, which in many scenarios are too difficult or expensive to obtain. For example, there are many domains for which no gold references exist, while crosslingual document-level links are present for many multilingual data collections. In this paper we investigate methods where a supervision signal for output structures can be extracted from weak feedback. We use learning from weak feedback, or weakly supervised learning, to refer to a scenario where output structures generated by the model are judged according to an external metric, and this feedback is used to extract a supervision signal that guides the learning process. Metric-augmented sequence-level objectives from reinforcement learning (Williams, 1992; Ranzato et al., 2016), minimum risk training (MRT) (Smith and Eisner, 2006; Shen et al., 2016), or margin-based structured prediction objectives (Taskar et al., 2005; Edunov et al., 2018) can be seen as instances of such algorithms.
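As a point of comparison for the objectives above, MRT minimizes the expected task cost under a renormalized model distribution over a sampled candidate set. The following is a minimal sketch, not the paper's implementation; `mrt_risk` and the smoothing exponent `alpha` are assumed names following the common formulation of Shen et al. (2016):

```python
import math

def mrt_risk(log_probs, costs, alpha=1.0):
    """Minimum risk training objective over sampled candidates.

    log_probs: unnormalized model log-probabilities of each sample
    costs:     task cost Delta(y) of each sample, e.g. 1 - BLEU

    The model scores are sharpened by alpha and renormalized over
    the sample (computed via a max-shift for numerical stability);
    the risk is the expected cost under that distribution.
    """
    scaled = [alpha * lp for lp in log_probs]
    m = max(scaled)
    weights = [math.exp(s - m) for s in scaled]
    z = sum(weights)
    return sum((w / z) * c for w, c in zip(weights, costs))
```

Unlike the bipolar ramp loss, MRT weights all candidates softly by their renormalized probability rather than singling out one output to promote and one to discourage.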

