Abstract

Chinese part-of-speech (POS) tagging is an essential task for downstream Chinese natural language processing. The accuracy of word-based POS tagging methods drops sharply because of segmentation errors and word sparsity. In addition, there are several Chinese POS tag sets with different annotation criteria, and some of them have only small annotated corpora, which makes models hard to train. To address these problems, we propose a modified word-based transformer neural network architecture, combined with an adversarial transfer learning method that splits the architecture into shared and private parts. This work directly improves the word-based model instead of adopting a joint character-based method. Extensive experiments show that our method achieves state-of-the-art performance on all datasets and, more importantly, effectively improves performance on the word-based Chinese sequence labeling task.
