Abstract

Most research on Chinese named entity recognition (NER) focuses on the general domain, and little of it addresses NER in the science and technology domain. On the one hand, technical terms in this domain appear infrequently in general texts, most of them are compound words, and word segmentation performs poorly on scientific and technical texts. On the other hand, texts in this domain are more precise and standardized than those in the general domain. Based on an analysis of these characteristics, this paper attempts to train word vectors by constructing terminology dictionaries and introducing dependency analysis. Following recent NER results in the Chinese general domain, namely the method of merging character vectors with word vectors, we perform NER on scientific and technical texts. Experiments show that the proposed method is more effective than existing work. We also study the effect of introducing an attention mechanism on NER results.

