Abstract

Virtual adversarial training (VAT) is a powerful technique for improving model robustness in both supervised and semi-supervised settings. It is effective and easily applied to many image classification and text classification tasks. However, its benefits to sequence labeling tasks such as named entity recognition (NER) have not been as significant, largely because previous approaches could not combine VAT with the conditional random field (CRF). The CRF can significantly boost the accuracy of sequence models by constraining label transitions, which makes it an essential component in most state-of-the-art sequence labeling architectures. In this paper, we propose SeqVAT, a method that naturally applies VAT to sequence labeling models with a CRF. Empirical studies show that SeqVAT not only significantly improves sequence labeling performance over baselines under supervised settings, but also outperforms state-of-the-art approaches under semi-supervised settings.

Highlights

  • While they have achieved great success on various computer vision and natural language processing tasks, deep neural networks, even state-of-the-art models, are usually vulnerable to tiny input perturbations (Szegedy et al., 2014; Goodfellow et al., 2015).

  • Our evaluation demonstrates that SeqVAT brings significant improvements in supervised settings, in contrast to the marginal improvements reported by previous virtual adversarial training (VAT)-based approaches (Clark et al., 2018).

  • We adopt the neural-conditional random field (CRF) architecture via a CNN-LSTM-CRF model, which consists of one convolutional neural network (CNN) layer to generate character embeddings, two bidirectional long short-term memory (LSTM) layers as the encoder, and a CRF layer as the decoder (see the sketch below).
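
To make the architecture concrete, here is a minimal PyTorch sketch of such a CNN-LSTM-CRF model. The hyperparameters, class and argument names, and the use of the `pytorch-crf` package for the CRF layer are illustrative assumptions, not details taken from the paper.

```python
# Minimal CNN-LSTM-CRF sketch; dimensions are illustrative assumptions.
import torch
import torch.nn as nn
from torchcrf import CRF  # pip install pytorch-crf


class CNNLSTMCRF(nn.Module):
    def __init__(self, word_vocab, char_vocab, num_tags,
                 word_dim=100, char_dim=30, char_filters=30, hidden_dim=200):
        super().__init__()
        self.word_emb = nn.Embedding(word_vocab, word_dim)
        self.char_emb = nn.Embedding(char_vocab, char_dim)
        # Character CNN: one conv layer, max-pooled over characters.
        self.char_cnn = nn.Conv1d(char_dim, char_filters,
                                  kernel_size=3, padding=1)
        # Two-layer bidirectional LSTM encoder.
        self.encoder = nn.LSTM(word_dim + char_filters, hidden_dim,
                               num_layers=2, bidirectional=True,
                               batch_first=True)
        self.emissions = nn.Linear(2 * hidden_dim, num_tags)
        # CRF decoder: models label-transition constraints.
        self.crf = CRF(num_tags, batch_first=True)

    def _embed(self, words, chars):
        # words: (batch, seq_len); chars: (batch, seq_len, word_len)
        b, s, w = chars.shape
        c = self.char_emb(chars).view(b * s, w, -1).transpose(1, 2)
        c = torch.relu(self.char_cnn(c)).max(dim=2).values.view(b, s, -1)
        return torch.cat([self.word_emb(words), c], dim=-1)

    def forward(self, words, chars, tags=None):
        h, _ = self.encoder(self._embed(words, chars))
        e = self.emissions(h)
        if tags is not None:
            return -self.crf(e, tags)  # negative sequence log-likelihood
        return self.crf.decode(e)      # best tag sequence per sentence
```

Note that the CRF loss is a sequence-level log-likelihood over whole label paths, not a set of independent per-token softmaxes; this is precisely why the conventional token-wise VAT loss does not transfer directly to CRF models.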


Summary

Introduction

While they have achieved great success on various computer vision and natural language processing tasks, deep neural networks, even state-of-the-art models, are usually vulnerable to tiny input perturbations (Szegedy et al., 2014; Goodfellow et al., 2015).

To apply VAT to sequence labeling, Clark et al. (2018) proposed using a softmax layer on top of the token representations to obtain a label probability distribution for each token. In this fashion, VAT takes the KL divergence between tokens at the same position in the original sequence and the adversarial sequence as the adversarial loss. To apply conventional VAT to a model with a CRF, one can likewise calculate the KL divergence of each token's label distribution between the original and adversarial examples; this is sub-optimal because the transition probabilities are not taken into account. In semi-supervised settings, SeqVAT outperforms widely used methods such as self-training (ST) (Yarowsky, 1995) and entropy minimization (EM) (Grandvalet and Bengio, 2004), as well as the state-of-the-art semi-supervised sequence labeling algorithm, cross-view training (CVT) (Clark et al., 2018).
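
For reference, the conventional per-token VAT scheme described above can be sketched as follows. Here `model` is assumed to map embedded inputs to per-token label logits; the single power-iteration step and the `xi`/`eps` names follow common VAT implementations and are assumptions rather than the paper's exact formulation.

```python
# Sketch of conventional per-token VAT for sequence labeling: the KL
# divergence is computed token by token, ignoring CRF transitions.
import torch
import torch.nn.functional as F


def token_vat_loss(model, embeddings, xi=1e-6, eps=1.0):
    """KL divergence between per-token label distributions of the
    original input and a virtually adversarial perturbation of it."""
    with torch.no_grad():
        p = F.softmax(model(embeddings), dim=-1)   # (batch, seq, tags)
    # One power-iteration step to estimate the adversarial direction.
    d = xi * F.normalize(torch.randn_like(embeddings).flatten(1),
                         dim=1).view_as(embeddings)
    d.requires_grad_()
    q = F.log_softmax(model(embeddings + d), dim=-1)
    kl = F.kl_div(q, p, reduction="batchmean")
    grad = torch.autograd.grad(kl, d)[0]
    r_adv = eps * F.normalize(grad.flatten(1), dim=1).view_as(grad)
    # Adversarial loss: token-wise KL against the perturbed input.
    q_adv = F.log_softmax(model(embeddings + r_adv), dim=-1)
    return F.kl_div(q_adv, p, reduction="batchmean")
```

Because this loss compares each token's distribution independently, it cannot penalize perturbations that flip labels into sequences the CRF's transition matrix would forbid, which is the gap SeqVAT is designed to close.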

Sequence Labeling
Semi-Supervised Learning
Virtual Adversarial Training
Method
Model Architecture
Word Embeddings
Character CNN Layer
CRF Layer
Adversarial Training
SeqVAT
Training with Adversarial Loss
Experiment Settings
Dataset
Supervised Sequence Labeling
Semi-Supervised Sequence Labeling
K-best Selection in SeqVAT
Impact of Unlabeled Data
Comparison on Semi-Supervised Approaches
Conclusion
