Chinese Word Segmentation via BiLSTM+Semi-CRF with Relay Node

Nuo Qun, Xi-Peng Qiu, Xuan-Jing Huang, Hang Yan

doi:10.1007/s11390-020-9576-4

Abstract

Semi-Markov conditional random fields (Semi-CRFs) have been successfully utilized in many segmentation problems, including Chinese word segmentation (CWS). The advantage of Semi-CRF lies in its inherent ability to exploit properties of whole segments rather than of individual sequence elements. Despite this theoretical advantage, Semi-CRF is still not the best choice for CWS because its computational complexity is quadratic in the sentence length. In this paper, we propose a simple yet effective framework that helps Semi-CRF achieve performance comparable to CRF-based models at similar computational complexity. Specifically, we first adopt a bi-directional long short-term memory (BiLSTM) network at the character level to model context information, and then use a simple but effective fusion layer to represent segment information. In addition, to model arbitrarily long segments within linear time complexity, we propose a new model named Semi-CRF-Relay. Because segments are modeled directly, word features are easy to incorporate, and CWS performance can be enhanced merely by adding publicly available pre-trained word embeddings. Experiments on four popular CWS datasets show the effectiveness of our proposed methods. The source code and pre-trained embeddings of this paper are available at https://github.com/fastnlp/fastNLP/.
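To make the complexity argument in the abstract concrete, the following is a minimal sketch of segment-level (semi-Markov) Viterbi decoding. It is not the authors' implementation and does not reproduce Semi-CRF-Relay: the function names `semi_markov_decode` and `toy_score`, the toy lexicon, and the penalty for unknown segments are all assumptions made for illustration. In the paper's setting the segment score would come from the BiLSTM and fusion layer rather than from a lookup table. With a cap `max_len` on segment length the loop costs O(n * max_len) score evaluations; removing the cap (max_len = n) gives the quadratic cost the abstract refers to.

```python
from typing import Callable, List, Tuple

# Hypothetical toy lexicon standing in for a learned segment scorer.
TOY_LEXICON = {"北京": 2.0, "大学": 2.0, "北京大学": 5.0}


def toy_score(segment: str) -> float:
    """Assumed scorer: lexicon words get their score, anything else is
    penalized by the square of its length (single characters cost -1)."""
    return TOY_LEXICON.get(segment, -float(len(segment) ** 2))


def semi_markov_decode(
    chars: str,
    score: Callable[[str], float],
    max_len: int,
) -> Tuple[float, List[str]]:
    """Return the best-scoring segmentation whose segments have at most
    `max_len` characters, using dynamic programming over segment ends."""
    n = len(chars)
    best = [float("-inf")] * (n + 1)  # best[i]: best score of chars[:i]
    back = [0] * (n + 1)              # back[i]: start of the last segment
    best[0] = 0.0
    for end in range(1, n + 1):
        # Only the last `max_len` start positions are considered.
        for start in range(max(0, end - max_len), end):
            candidate = best[start] + score(chars[start:end])
            if candidate > best[end]:
                best[end] = candidate
                back[end] = start
    # Follow back-pointers from the end of the sentence.
    segments: List[str] = []
    end = n
    while end > 0:
        start = back[end]
        segments.append(chars[start:end])
        end = start
    segments.reverse()
    return best[n], segments


print(semi_markov_decode("我在北京大学", toy_score, max_len=4))
# (3.0, ['我', '在', '北京大学'])
print(semi_markov_decode("我在北京大学", toy_score, max_len=2))
# (2.0, ['我', '在', '北京', '大学'])
```

The second call shows the trade-off a length cap introduces: with `max_len=2` the four-character word can no longer be proposed as one segment. Handling arbitrarily long segments without giving up linear time is the problem the abstract says Semi-CRF-Relay addresses; how it does so is not described here.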

Journal: Journal of Computer Science and Technology
Publication Date: Sep 30, 2020
Citations: 13

Similar Papers

Is Local Window Essential for Neural Network Based Chinese Word Segmentation?
Jinchao Zhang ... Qun Liu
01 Jan 2015

DGeoSegmenter: A dictionary-based Chinese word segmenter for the geoscience domain
Qinjun Qiu ... Wenjia Li
Computers & Geosciences | VOL. 121
07 Sep 2018

Gujarati Task Oriented Dialogue Slot Tagging Using Deep Neural Network Models
Rachana Parikh ... Hiren Joshi
01 Jan 2020

Chinese Word Segmentation and Recognition Based on Separable Convolution Bidirectional Long Short-Term Memory and Feature Point
18 Dec 2020
