Low Resource Sequence Tagging with Weak Labels

Edwin Simpson, Iryna Gurevych, Jonas Pfeiffer

doi:10.1609/aaai.v34i05.6415

Abstract

Current methods for sequence tagging depend on large quantities of domain-specific training data, limiting their use in new, user-defined tasks with few or no annotations. While crowdsourcing can be a cheap source of labels, it often introduces errors that degrade the performance of models trained on such crowdsourced data. Another solution is to use transfer learning to tackle low resource sequence labelling, but current approaches rely heavily on similar high resource datasets in different languages. In this paper, we propose a domain adaptation method that uses Bayesian sequence combination to exploit pre-trained models and unreliable crowdsourced data, and that does not require high resource data in a different language. Our method boosts performance by learning the relationship between each labeller and the target task, and trains a sequence labeller on the target domain with little or no gold-standard data. We apply our approach to labelling diagnostic classes in medical and educational case studies, showing that the model achieves strong performance through zero-shot transfer learning and is more effective than alternative ensemble methods. Using NER and information extraction tasks, we show how our approach can train a model directly from crowdsourced labels, outperforming pipeline approaches that first aggregate the crowdsourced data, then train on the aggregated labels.

Journal: Proceedings of the AAAI Conference on Artificial Intelligence
Publication Date: Apr 3, 2020
Citations: 13

Similar Papers

An Empirical Study of Pre-trained Transformers for Arabic Information Extraction
Wuwei Lan ... Alan Ritter
01 Jan 2020

On Transfer Learning Techniques for Machine Learning
30 Apr 2020

Transfer joint embedding for cross-domain named entity recognition
Sinno Jialin Pan ... Jian Su
ACM Transactions on Information Systems | VOL. 31
01 May 2013

A Transfer Learning-Based Multi-Instance Learning Method With Weak Labels.
Yanshan Xiao ... Fei Liang
IEEE Transactions on Cybernetics | VOL. 52
03 Mar 2020
