Abstract

Neural-network-based Chinese Word Segmentation (CWS) approaches can bypass the burdensome feature engineering required by conventional methods. However, all previous neural-network-based approaches rely on a fixed local window of characters during sequence labelling: they can hardly exploit context outside the window and may retain irrelevant context inside it. Moreover, the window size is a laborious, manually tuned hyper-parameter with a significant influence on model performance. We ask whether the local window can be discarded in neural-network-based CWS. In this paper, we present a window-free Chinese word segmentation model based on a Bi-directional Long Short-Term Memory (Bi-LSTM) neural network. The model takes the whole sentence into consideration to generate a reasonable word sequence. Experiments show that the Bi-LSTM can learn sufficient context for CWS without a local window.
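To make the window-free architecture concrete, below is a minimal sketch of a Bi-LSTM character tagger for CWS. The BMES tag scheme, layer sizes, and the use of PyTorch are illustrative assumptions, not details taken from the paper; the point is only that each character's tag scores are computed from the whole sentence rather than from a fixed local window.

```python
# Minimal sketch of a window-free Bi-LSTM character tagger for CWS.
# Assumptions (not from the paper): BMES tag set, embedding/hidden sizes,
# and PyTorch as the framework are illustrative choices.
import torch
import torch.nn as nn

class BiLSTMSegmenter(nn.Module):
    def __init__(self, vocab_size, embed_dim=100, hidden_dim=128, num_tags=4):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        # The bi-directional LSTM reads the entire sentence, so every
        # character's representation depends on the full left and right
        # context instead of a fixed-size local window.
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True,
                            bidirectional=True)
        self.proj = nn.Linear(2 * hidden_dim, num_tags)  # B/M/E/S scores

    def forward(self, char_ids):
        # char_ids: (batch, seq_len) integer character indices
        x = self.embed(char_ids)
        h, _ = self.lstm(x)        # (batch, seq_len, 2 * hidden_dim)
        return self.proj(h)        # per-character tag scores

# Usage: tag a toy 5-character sentence with one BMES label per character.
model = BiLSTMSegmenter(vocab_size=5000)
sentence = torch.randint(0, 5000, (1, 5))
tags = model(sentence).argmax(dim=-1)  # (1, 5) predicted tag indices
```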
