Abstract

Many tasks in natural language processing involve predicting structured outputs, e.g., sequence labeling, semantic role labeling, parsing, and machine translation. Researchers are increasingly applying deep representation learning to these problems, but the structured component of these approaches is usually quite simplistic. In this work, we propose several high-order energy terms to capture complex dependencies among labels in sequence labeling, including several that consider the entire label sequence. We use neural parameterizations for these energy terms, drawing from convolutional, recurrent, and self-attention networks. We use the framework of learning energy-based inference networks (Tu and Gimpel, 2018) for dealing with the difficulties of training and inference with such models. We empirically demonstrate that this approach achieves substantial improvement using a variety of high-order energy terms on four sequence labeling tasks, while having the same decoding speed as simple, local classifiers. We also find high-order energies to help in noisy data conditions.
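To make the setup concrete, below is a minimal, hedged sketch (in PyTorch, not the authors' released code) of how an energy over a relaxed label sequence might combine a local unary term with one high-order term parameterized by a convolution over label trigrams. The class names, dimensions, and the max-pooling choice are illustrative assumptions.

```python
import torch
import torch.nn as nn

class UnaryEnergy(nn.Module):
    """Local term: per-token label scores dotted with the (relaxed) labels."""
    def __init__(self, hidden, num_labels):
        super().__init__()
        self.proj = nn.Linear(hidden, num_labels)

    def forward(self, feats, y):
        # feats: (batch, seq, hidden); y: (batch, seq, num_labels), e.g. softmax outputs
        return (self.proj(feats) * y).sum(dim=(1, 2))

class TrigramConvEnergy(nn.Module):
    """High-order term: a 1-D convolution with window 3 over the label sequence."""
    def __init__(self, num_labels, filters=32):
        super().__init__()
        self.conv = nn.Conv1d(num_labels, filters, kernel_size=3, padding=1)
        self.score = nn.Linear(filters, 1)

    def forward(self, y):
        h = torch.relu(self.conv(y.transpose(1, 2)))         # (batch, filters, seq)
        return self.score(h.max(dim=2).values).squeeze(-1)   # (batch,)

def energy(feats, y, unary, high_order):
    # Lower energy = better (input, label-sequence) pair; an inference network is
    # trained to output y that approximately minimizes this quantity.
    return -(unary(feats, y) + high_order(y))
```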

Highlights

  • Conditional random fields (CRFs; Lafferty et al., 2001) have been shown to perform well in various sequence labeling tasks

  • While the optimal energy function varies by task, we find strong performance from skip-chain terms with short skip distances, convolutional networks with filters that consider label trigrams, and recurrent networks and self-attention networks that consider large subsequences of labels (see the sketch after this list)

  • Here we find that the framework of SPEN learning with inference networks can support a wide range of high-order energies for sequence labeling
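As referenced in the second highlight, the following is a speculative sketch of a skip-chain energy term: pairwise potentials between labels a fixed skip distance apart, applied to the relaxed label vectors produced by an inference network. The bilinear parameterization via a single matrix `W` is an assumption made for illustration.

```python
import torch
import torch.nn as nn

class SkipChainEnergy(nn.Module):
    """Pairwise potentials between labels `skip` positions apart."""
    def __init__(self, num_labels, skip=2):
        super().__init__()
        self.skip = skip
        self.W = nn.Parameter(torch.zeros(num_labels, num_labels))

    def forward(self, y):
        # y: (batch, seq, num_labels) relaxed label vectors
        a, b = y[:, :-self.skip, :], y[:, self.skip:, :]
        # sum over positions t of  y_t^T W y_{t+skip}
        return torch.einsum('bti,ij,btj->b', a, self.W, b)
```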


Summary

Introduction

Conditional random fields (CRFs; Lafferty et al., 2001) have been shown to perform well in various sequence labeling tasks. A major challenge with CRFs is the complexity of training and inference, which is quadratic in the number of output labels for first-order models and grows exponentially when higher-order dependencies are considered. This explains why the most common type of CRF used in practice is a first-order model, referred to as a "linear-chain" CRF. We instead train inference networks to approximate energy minimization, which makes richer, high-order energy terms practical. Enlarging the inference network architecture by adding one layer consistently improves results, rivaling or surpassing a BiLSTM-CRF baseline, which suggests that training efficient inference networks with high-order energy terms can compensate for errors arising from approximate inference. While we focus on sequence labeling in this paper, our results show the potential of developing high-order structured models for other NLP tasks in the future.
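To illustrate the decoding-speed point, here is a hedged sketch of an inference network at test time: a BiLSTM tagger (an assumed architecture) whose output is simply an argmax per position, so decoding costs the same as a simple local classifier regardless of how expressive the training-time energy is.

```python
import torch
import torch.nn as nn

class InferenceNetwork(nn.Module):
    """A BiLSTM tagger producing a label distribution at every position."""
    def __init__(self, vocab_size, emb=100, hidden=128, num_labels=10):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb)
        self.rnn = nn.LSTM(emb, hidden, batch_first=True, bidirectional=True)
        self.out = nn.Linear(2 * hidden, num_labels)

    def forward(self, tokens):
        h, _ = self.rnn(self.emb(tokens))
        return torch.softmax(self.out(h), dim=-1)  # relaxed labels fed to the energy

@torch.no_grad()
def decode(model, tokens):
    # No Viterbi pass: an independent argmax per position, so test-time cost is
    # linear in sequence length and in the number of labels.
    return model(tokens).argmax(dim=-1)
```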

Structured Energy-Based Learning
Inference Networks
An Objective for Joint Learning of Inference Networks
Energy Functions
Linear Chain Energies
Skip-Chain Energies
High-Order Energies
Fully-Connected Energies
Related Work
Datasets
Training
Results
Results on Noisy Datasets
Incorporating BERT
Analysis of Learned Energies
Conclusion