Abstract

Most current research on Chinese Named Entity Recognition (NER) assumes that adequate annotated data are available. In many scenarios, however, the amount of annotated data that the Chinese NER task requires is difficult to obtain, and machine learning methods perform poorly as a result. To address this, this paper mines the information contained in massive unlabeled raw text and uses it to improve performance on the Chinese NER task. We propose a deep learning model combined with a transfer learning technique. The method is applicable to domains that have a large amount of unlabeled text but only a small amount of annotated data. Experimental results show that the proposed method performs well on datasets of different sizes and that it also avoids the errors that arise during word segmentation. We further evaluate the effect of transfer learning from several aspects through a series of experiments.

