Abstract

The prevalent approaches to Chinese word segmentation mostly rely on Bi-LSTM neural networks. However, Bi-LSTM-based methods suffer from an inherent drawback, vanishing gradients, which makes them inefficient at capturing long-range character information in long sentences. In this work, we propose a novel sequence-to-sequence model for Chinese word segmentation based on a convolutional architecture known as the temporal convolutional network (TCN). The model uses the TCN to construct an encoder, a fully-connected neural network to build a decoder, and the Viterbi algorithm as an inference layer that produces the final segmentation. By stacking additional encoder layers, the model captures character information from distant positions in a long sentence. To achieve better segmentation results, the model incorporates a Conditional Random Field (CRF) objective for parameter training. Experiments on Chinese corpora show that the model outperforms the Bi-LSTM model on Chinese word segmentation and handles long sentences better than the Bi-LSTM.
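Two of the abstract's claims can be made concrete with a small sketch: the receptive field of a dilated TCN grows exponentially with depth (which is why adding encoder layers captures more distant characters), and the inference layer is a Viterbi search over BMES segmentation tags. The code below is an illustrative sketch under assumed conventions (dilation doubling per layer, BMES tagging, hand-set scores), not the authors' implementation.

```python
# Hypothetical sketch: TCN receptive-field growth and Viterbi inference
# over BMES tags. All function names and scores are illustrative.

def receptive_field(num_layers, kernel_size=3):
    """Receptive field of stacked dilated convolutions whose dilation
    doubles per layer (1, 2, 4, ...), as in a typical TCN encoder."""
    rf = 1
    for i in range(num_layers):
        rf += (kernel_size - 1) * (2 ** i)
    return rf

def viterbi(emissions, transitions, tags=("B", "M", "E", "S")):
    """Highest-scoring tag path given per-character emission scores
    (list of {tag: score} dicts) and a {(prev, cur): score} transition
    table, as a CRF-style inference layer would use."""
    neg_inf = float("-inf")
    score = {t: emissions[0].get(t, neg_inf) for t in tags}
    back = []
    for em in emissions[1:]:
        new_score, ptr = {}, {}
        for t in tags:
            best_prev = max(
                tags,
                key=lambda p: score[p] + transitions.get((p, t), neg_inf),
            )
            new_score[t] = (
                score[best_prev]
                + transitions.get((best_prev, t), neg_inf)
                + em.get(t, neg_inf)
            )
            ptr[t] = best_prev
        score = new_score
        back.append(ptr)
    last = max(tags, key=lambda t: score[t])
    path = [last]
    for ptr in reversed(back):
        path.append(ptr[path[-1]])
    return list(reversed(path))
```

With kernel size 3, a 4-layer stack already spans 31 characters, so a few extra layers suffice to cover most long sentences; the Viterbi pass then rules out invalid tag bigrams (e.g. "B" followed by "S") via the transition table.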
