Abstract

Previous studies normally formulate named entity recognition (NER) as a sequence labeling task and optimize the solution in the sentence level. In this article, we propose a document-level optimization approach to NER and apply it in a domain-specific document-level NER task. As a baseline, we apply a state-of-the-art approach, i.e., long-short-term memory (LSTM), to perform word classification. On this basis, we define a global objective function with the obtained word classification results and achieve global optimization via Integer Linear Programming (ILP). Specifically, in the ILP-based approach, we propose four kinds of constraints, i.e., label transition, entity length, label consistency, and domain-specific regulation constraints, to incorporate various entity recognition knowledge in the document level. Empirical studies demonstrate the effectiveness of the proposed approach to domain-specific document-level NER.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call