Abstract

Named Entity Recognition (NER) is a prerequisite for other tasks in Information Extraction. At present, most NER studies focus on person names, place names, and organization names, whereas domain entity recognition remains a challenging task. Recognizing entities in the Tibetan culture domain is important for the study of Tibetan culture. This article extracts domain keywords using an improved TextRank algorithm. A domain lexicon is then built from these keywords and used for word segmentation. On this basis, Tibetan culture domain entities are recognized using an improved Bootstrapping method. The method in this article achieves better extraction performance and generalizes well.
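
To make the keyword extraction step concrete, the following is a minimal sketch of the standard TextRank algorithm over a word co-occurrence graph. It is not the improved variant used in this article, whose modifications are not described in the abstract. The function name, the parameters (window, damping, iterations, top_k), and the sample tokens are all hypothetical and chosen only for illustration; the input is assumed to be already segmented into tokens.

```python
from collections import defaultdict


def textrank_keywords(tokens, window=2, damping=0.85, iterations=50, top_k=3):
    """Rank tokens with plain TextRank (illustrative; not the article's improved variant)."""
    # Build an undirected graph: words that co-occur within `window`
    # following tokens are linked to each other.
    neighbors = defaultdict(set)
    for i, word in enumerate(tokens):
        for other in tokens[i + 1 : i + 1 + window]:
            if other != word:
                neighbors[word].add(other)
                neighbors[other].add(word)

    # Apply the PageRank-style update for a fixed number of rounds.
    scores = {word: 1.0 for word in neighbors}
    for _ in range(iterations):
        scores = {
            word: (1 - damping)
            + damping * sum(scores[n] / len(neighbors[n]) for n in neighbors[word])
            for word in neighbors
        }

    # Highest score first; ties are broken alphabetically for determinism.
    ranked = sorted(scores, key=lambda w: (-scores[w], w))
    return ranked[:top_k]


# Hypothetical pre-segmented input, for illustration only.
sample_tokens = ["temple", "monastery", "temple", "festival", "temple", "thangka"]
print(textrank_keywords(sample_tokens, window=1, top_k=2))
```

In a pipeline like the one outlined above, the top-ranked words would seed the domain lexicon used for word segmentation; how the article weights or filters candidates is not stated here.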
