Abstract

Location Entity Recognition (LER) is an important part of Named Entity Recognition (NER), and using abundant unlabeled corpora to improve recognition performance is a significant research topic in this domain. A new method combining Active Learning with Self-Training is proposed. It selects samples based on confidence and 2-gram frequency, and expands the training set by annotating the unlabeled corpus both manually and automatically. Experiments show that the F-measure of this method is 8% higher than that of randomized Active Learning, while it requires only 1/3 of the annotation. Moreover, with this method only 5% of the characters in the extended training set need to be labeled manually to achieve performance similar to that of complete manual annotation.
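
The selection step described above can be illustrated with a minimal sketch. The abstract does not say how confidence and 2-gram frequency are combined, so everything here is an assumption made for illustration: the function names, the thresholds `low` and `high`, the `budget` parameter, the use of character 2-grams averaged per sentence, and the rule itself (low-confidence sentences with frequent 2-grams go to manual annotation, high-confidence sentences are labeled automatically). The confidence scores are assumed to come from some already trained tagger, which is not shown.

```python
from collections import Counter


def bigram_counts(sentences):
    """Count character 2-grams over the whole unlabeled pool."""
    counts = Counter()
    for s in sentences:
        counts.update(s[i:i + 2] for i in range(len(s) - 1))
    return counts


def mean_bigram_freq(sentence, counts):
    """Average pool frequency of the sentence's 2-grams (0.0 if it has none)."""
    grams = [sentence[i:i + 2] for i in range(len(sentence) - 1)]
    if not grams:
        return 0.0
    return sum(counts[g] for g in grams) / len(grams)


def split_pool(scored, low=0.4, high=0.9, budget=2):
    """Split (sentence, confidence) pairs into manual and automatic sets.

    Hypothetical rule, not taken from the paper:
      - confidence < low: candidate for manual annotation (Active Learning);
        candidates are ranked by mean 2-gram frequency, top `budget` kept.
      - confidence >= high: model labels are accepted (Self-Training).
    Sentences in between are left in the pool for a later round.
    """
    counts = bigram_counts([s for s, _ in scored])
    uncertain = [s for s, c in scored if c < low]
    uncertain.sort(key=lambda s: mean_bigram_freq(s, counts), reverse=True)
    manual = uncertain[:budget]
    auto = [s for s, c in scored if c >= high]
    return manual, auto
```

In a full loop, the tagger would be retrained on the enlarged training set after each round and the remaining pool rescored. Ranking uncertain sentences by 2-gram frequency is one plausible reading of the abstract: it favors samples whose patterns recur often in the corpus, so each manual label is likely to help on more text.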
