Abstract
Automatic domain term extraction (ADTE) is an important task in natural language processing and is widely applied in information retrieval, information extraction, data mining, machine translation, and other information processing fields. In this paper, we propose an automatic domain term extraction method based on conditional random fields (CRF). We treat domain term extraction as a sequence labeling problem and use the distribution characteristics of terms as features of the CRF model. We then use a CRF tool to train a model for term extraction. Experimental results show that the method is simple, applies to common domains, and achieves good results. In the open test, precision was 73.24%, recall was 69.57%, and the F-measure was 71.36%.
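To make the sequence-labeling framing concrete, the following is a minimal sketch, not the authors' implementation. It assumes a B/I/O tag set (the abstract does not name the tag set used) and shows how known terms in a token sequence could be turned into per-token labels of the kind a CRF is trained on. The function names `bio_encode` and `f_measure` are hypothetical. The second function is the standard balanced F-measure, which reproduces the reported score from the reported precision and recall.

```python
def bio_encode(tokens, terms):
    """Label each token B (term start), I (inside a term), or O (outside).

    The B/I/O scheme is an assumption for illustration; `terms` is a list
    of token tuples that are known domain terms.
    """
    labels = ["O"] * len(tokens)
    # Match longer terms first so a long term is not split by a shorter one.
    for term in sorted(terms, key=len, reverse=True):
        n = len(term)
        for i in range(len(tokens) - n + 1):
            window_free = all(label == "O" for label in labels[i:i + n])
            if tokens[i:i + n] == list(term) and window_free:
                labels[i] = "B"
                labels[i + 1:i + n] = ["I"] * (n - 1)
    return labels


def f_measure(precision, recall):
    """Balanced F-measure: the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


tokens = ["conditional", "random", "fields", "label", "each", "token"]
terms = [("conditional", "random", "fields")]
labels = bio_encode(tokens, terms)

# Precision and recall as reported in the abstract, in percent.
score = f_measure(73.24, 69.57)
```

Here `labels` marks the three-token term as one B followed by two I tags, and `score` agrees with the reported 71.36% to two decimal places. In a full pipeline, each token would additionally carry feature values (such as the distribution characteristics mentioned above) alongside its label.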