Language Models in the Loop: Incorporating Prompting into Weak Supervision

Ryan Smith, Braden Hancock, Stephen H Bach, Jason A Fries

doi:10.1145/3617130

Abstract

We propose a new strategy for applying large pre-trained language models to novel tasks when labeled training data is limited. Rather than apply the model in a typical zero-shot or few-shot fashion, we treat the model as the basis for labeling functions in a weak supervision framework. To create a classifier, we first prompt the model to answer multiple distinct queries about an example and define how the possible responses should be mapped to votes for labels and abstentions. We then denoise these label sources using the Snorkel system and train an end classifier with the resulting training data. Our experimental evaluation shows that prompting large language models within a weak supervision framework can provide significant gains in accuracy. On the WRENCH weak supervision benchmark, this approach significantly improves over zero-shot performance, with an average 19.5% reduction in errors. We also find that this approach produces classifiers with comparable or superior accuracy to those trained from hand-engineered rules.
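The pipeline described in the abstract (prompt the model with several queries, map each response to a label vote or an abstention, then combine the votes) can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration and not the authors' implementation: `toy_model` is a hypothetical keyword-based stand-in for a prompted language model, the three prompts and the spam/ham task are invented, and a simple majority vote takes the place of Snorkel's label model, which the paper uses for denoising.

```python
from collections import Counter
from typing import Callable, Dict, List

ABSTAIN, HAM, SPAM = -1, 0, 1


def toy_model(prompt: str) -> str:
    """Hypothetical stand-in for a prompted language model (keyword rules only)."""
    question, _, text = prompt.partition("\n")
    text = text.lower()
    if "link" in question:
        return "yes" if "http" in text else "no"
    if "money" in question:
        return "yes" if ("free" in text or "prize" in text) else "no"
    if "personal" in question:
        return "yes" if ("thanks" in text or "see you" in text) else "no"
    return "unsure"


# Each prompt comes with a mapping from model responses to label votes.
# Any response missing from the mapping becomes an abstention.
PROMPTS = [
    ("Does the following message contain a link?", {"yes": SPAM}),
    ("Does the following message offer money or prizes?", {"yes": SPAM}),
    ("Is the following message a personal note?", {"yes": HAM}),
]


def make_labeling_function(
    question: str, mapping: Dict[str, int], model: Callable[[str], str]
) -> Callable[[str], int]:
    """Wrap one prompt and its response mapping as a labeling function."""

    def labeling_function(text: str) -> int:
        response = model(f"{question}\n{text}").strip().lower()
        return mapping.get(response, ABSTAIN)

    return labeling_function


def majority_vote(votes: List[int]) -> int:
    """Simplified aggregation (assumption): not Snorkel's label model."""
    counts = Counter(v for v in votes if v != ABSTAIN)
    if not counts:
        return ABSTAIN
    ranked = counts.most_common()
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return ABSTAIN  # tie between labels: no training label is emitted
    return ranked[0][0]


labeling_functions = [make_labeling_function(q, m, toy_model) for q, m in PROMPTS]

examples = [
    "Win a FREE prize at http://example.com",
    "Thanks for dinner, see you soon",
    "Meeting moved to 3pm",
]

# One row of votes per example, one column per prompted labeling function.
label_matrix = [[lf(text) for lf in labeling_functions] for text in examples]
weak_labels = [majority_vote(row) for row in label_matrix]
```

In a full pipeline, the examples that receive a non-abstain weak label would become the training set for the end classifier, and the majority vote would be replaced by a learned label model that estimates the accuracy of each labeling function.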

More From: ACM / IMS Journal of Data Science

Similar Papers

Towards an Enhanced Understanding of Bias in Pre-trained Neural Language Models: A Survey with Special Emphasis on Affective Bias
Anoop K ... Lajish V L
01 Jan 2021

Jigsaw
Naman Jain ... Arun Iyer
21 May 2022

A Large and Diverse Arabic Corpus for Language Modeling
Abbas Raza Ali ... Hasan Raza Ali
Procedia Computer Science | VOL. 225
01 Jan 2023

Empirical evaluation of language modeling to ascertain cancer outcomes from clinical text reports
Haitham A Elmarakeby ... Kenneth L Kehl
BMC bioinformatics | VOL. 24
02 Sep 2023
