Abstract

Organization names are a frequently occurring but ever-changing class of proper nouns in text. Chinese organization name recognition is a non-trivial task in named entity recognition (NER). Compared with other entities such as person and location names, Chinese organization names are the most difficult to identify. Statistical approaches to automatic NER are currently widely studied. In this paper, we try to clarify several puzzling problems in statistical Chinese organization name recognition and present experimental conclusions. Does the encoding scheme used in a classification-based recognition system affect performance, and if so, by how much? Should we build a single identification model for all named entity types, or one model for each type? Does recognizing Chinese organization names after person and location identification outperform the parallel approach? Which is better, word-based or character-based Chinese organization name recognition? Our conclusions are drawn from the SIGHAN Bakeoff datasets for NER.
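To make the encoding-scheme question concrete, the following minimal sketch tags one sentence at the character level under two common schemes. The abstract does not say which schemes the paper compares, so BIO and BIOES are assumptions here, as are the helper name `encode`, the example sentence, and the `ORG` label.

```python
from typing import List, Tuple

# (start, end_exclusive, entity_type); a hypothetical span format for illustration
Span = Tuple[int, int, str]

def encode(n: int, spans: List[Span], scheme: str = "BIO") -> List[str]:
    """Turn entity spans over n units (characters or words) into per-unit tags.

    BIO marks the Beginning and Inside of an entity; BIOES additionally
    marks the End unit and Single-unit entities. Both are assumed schemes.
    """
    tags = ["O"] * n
    for start, end, etype in spans:
        if scheme == "BIOES" and end - start == 1:
            tags[start] = f"S-{etype}"  # single-unit entity
            continue
        tags[start] = f"B-{etype}"
        for i in range(start + 1, end):
            tags[i] = f"I-{etype}"
        if scheme == "BIOES":
            tags[end - 1] = f"E-{etype}"  # overwrite the last I- with E-
    return tags

# Character-based view of "我在北京大学读书"; 北京大学 (Peking University) is the ORG.
chars = list("我在北京大学读书")
spans = [(2, 6, "ORG")]
bio = encode(len(chars), spans, "BIO")
bioes = encode(len(chars), spans, "BIOES")
```

Under these assumptions the two tag sequences differ only at the last character of the organization name (`I-ORG` versus `E-ORG`), which is the kind of difference a classifier would have to learn under each scheme. A word-based variant would pass segmented words instead of characters, with spans indexed over words.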
