Research on domain terminology recognition based on dependency tree-conditional random field

Yanyan Lin, Jieping Lu

doi:10.1088/1742-6596/1213/5/052076

Abstract

Manual annotation and classification of Chinese patent information is inconsistent, which causes missed results, partial results, and noise in patent search. To address this, this paper proposes a method for recognizing domain terminology based on a dependency tree and a conditional random field (CRF). The method builds on dependency grammar theory and uses existing techniques to label the dependency relations in the text. The corresponding technical feature words are then identified in the dependency-labelled output and used as training data for a conditional random field model that recognizes domain terminology. The experimental results show that acquiring training data through the dependency tree improves the accuracy, recall, and F-value of the recognition results.
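
The pipeline described in the abstract (dependency labelling, then feature-word identification, then CRF training) can be illustrated with a minimal Python sketch of the data-preparation step. This is an illustration under stated assumptions, not the authors' implementation: the `Token` structure, the feature names, the BIO tag `TERM`, the relation labels, and the toy English sentence are all hypothetical, and the abstract does not say which parser or CRF toolkit was used.

```python
from typing import Dict, List, NamedTuple, Tuple


class Token(NamedTuple):
    """One token of a dependency-parsed sentence (hypothetical structure)."""
    word: str
    pos: str     # part-of-speech tag
    head: int    # 1-based index of the head token, 0 for the root
    deprel: str  # dependency relation to the head


def token_features(sent: List[Token], i: int) -> Dict[str, str]:
    """Build a CRF feature dict for token i from the token and its dependency head."""
    tok = sent[i]
    is_root = tok.head == 0
    return {
        "word": tok.word,
        "pos": tok.pos,
        "deprel": tok.deprel,
        "head_word": "ROOT" if is_root else sent[tok.head - 1].word,
        "head_pos": "ROOT" if is_root else sent[tok.head - 1].pos,
        "prev_pos": sent[i - 1].pos if i > 0 else "BOS",
        "next_pos": sent[i + 1].pos if i < len(sent) - 1 else "EOS",
    }


def bio_labels(sent: List[Token], term_spans: List[Tuple[int, int]]) -> List[str]:
    """Turn term spans (start inclusive, end exclusive) into BIO labels."""
    labels = ["O"] * len(sent)
    for start, end in term_spans:
        labels[start] = "B-TERM"
        for j in range(start + 1, end):
            labels[j] = "I-TERM"
    return labels


# Toy sentence with assumed relation labels; "lithium battery" is the term.
sentence = [
    Token("lithium", "NN", 2, "nn"),
    Token("battery", "NN", 3, "nsubj"),
    Token("overheats", "VB", 0, "root"),
]
X = [token_features(sentence, i) for i in range(len(sentence))]
y = bio_labels(sentence, [(0, 2)])
```

Per-token feature dicts paired with a label sequence are the input shape that common linear-chain CRF toolkits accept, so `X` and `y` for many sentences could be collected into lists and passed to such a toolkit for training.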

Full Text