Abstract

Chinese Named Entity Recognition (CNER) is an important subtopic of Chinese Natural Language Processing and plays an important role in multiple tasks. However, entity boundaries are difficult to determine in Chinese text because Chinese words are not naturally separated, which makes CNER considerably more difficult. In addition, mainstream Named Entity Recognition (NER) is based on sequence tagging, for which labeling a training set is very costly, so many NER tasks are limited by insufficient training data. In this work, we propose CWAI, a new CNER method based on the adaptive incorporation of characters and words, to address the loss of word information caused by the lack of word boundaries. CWAI uses a convolutional neural network (CNN) to capture the local semantics of every character, and then uses an attention mechanism between characters and words to adaptively compute, for each character, the weights of the potential words that match a lexicon. To address the limited model performance caused by an insufficient training set, we combine our model with pre-trained models.
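The character-word incorporation described above can be illustrated with a minimal sketch. It is not the authors' implementation. The function names (`match_lexicon`, `conv1d_same`, `fuse`), the toy lexicon, the random embeddings, the scaled dot-product scoring, and the concatenation of character and word features are all assumptions made for illustration, since the abstract does not specify these details.

```python
import numpy as np

def match_lexicon(sentence, lexicon, max_len=4):
    """For each character position, list the lexicon words whose span covers it."""
    matches = [[] for _ in sentence]
    for start in range(len(sentence)):
        for end in range(start + 2, min(start + max_len, len(sentence)) + 1):
            word = sentence[start:end]
            if word in lexicon:
                for pos in range(start, end):
                    matches[pos].append(word)
    return matches

def conv1d_same(x, kernel):
    """Zero-padded 1D convolution over characters, followed by ReLU.
    x: (n, d), kernel: (k, d, d_out) with odd k."""
    k = kernel.shape[0]
    pad = k // 2
    padded = np.pad(x, ((pad, pad), (0, 0)))
    out = np.stack([
        np.einsum("kd,kdo->o", padded[i:i + k], kernel)
        for i in range(x.shape[0])
    ])
    return np.maximum(out, 0.0)

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def fuse(char_feats, matches, word_emb):
    """Attend from each character to its matched words (assumed scaled
    dot-product scoring) and concatenate the weighted word vector."""
    d = char_feats.shape[1]
    fused, weights = [], []
    for i, words in enumerate(matches):
        if words:
            W = np.stack([word_emb[w] for w in words])   # (m, d)
            a = softmax(W @ char_feats[i] / np.sqrt(d))  # one weight per word
            ctx = a @ W
        else:
            a, ctx = np.zeros(0), np.zeros(d)
        weights.append(a)
        fused.append(np.concatenate([char_feats[i], ctx]))
    return np.stack(fused), weights

# Toy inputs: random vectors stand in for learned embeddings (assumption).
rng = np.random.default_rng(0)
sentence = "南京市长江大桥"
lexicon = {"南京", "南京市", "市长", "长江", "大桥", "长江大桥"}
dim = 8
char_emb = {c: rng.normal(size=dim) for c in sorted(set(sentence))}
word_emb = {w: rng.normal(size=dim) for w in sorted(lexicon)}
kernel = rng.normal(size=(3, dim, dim)) * 0.1

x = np.stack([char_emb[c] for c in sentence])
char_feats = conv1d_same(x, kernel)            # local semantics per character
matches = match_lexicon(sentence, lexicon)     # candidate words per character
fused, weights = fuse(char_feats, matches, word_emb)
```

In this toy sentence the character 长 is covered by three candidate words (市长, 长江, 长江大桥), so `weights[3]` holds three attention weights that sum to one. In a full model the fused features would feed a sequence tagging layer, which is omitted here.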
