A Unified Knowledge Extraction Method Based on BERT and Handshaking Tagging Scheme

Ning Yang, Sio Hang Pun, Yifan Yang, Mang I Vai, Qingliang Miao

doi:10.3390/app12136543

Open Access

https://doi.org/10.3390/app12136543

Abstract

In practical knowledge extraction systems, different applications have different entity classes and relation schemas, so the ability of a knowledge extraction model to generalize and transfer is very important. Open-domain knowledge extraction, which trains a model in a source domain and applies it directly to an arbitrary target domain, is therefore crucial for mitigating these generalization and transfer issues. Traditional knowledge extraction models cannot be transferred directly to new domains, nor can they extract undefined relation types. To address these issues, we propose TPORE (Extract Open-domain Relations through Token Pair linking), an end-to-end Chinese open-domain knowledge extraction model that combines BERT with a handshaking tagging scheme. TPORE can alleviate the nested-entity and nested-relation problems. Additionally, we adopt a new loss function that performs a pairwise comparison of target-category scores and non-target-category scores to balance the weights automatically; the experimental results indicate that this loss function improves both speed and performance. Extensive experiments demonstrate that the proposed method significantly surpasses strong baselines. Specifically, our approach achieves new state-of-the-art results on the Chinese open relation extraction (ORE) benchmarks COER and SAOKE. On the COER dataset, F1 increased from 66.36% to 79.63%, and on the SpanSAOKE dataset, F1 increased from 46.0% to 54.91%. In the medical domain, our method achieves performance close to that of the SOTA method on the CMeIE and CMeEE datasets.
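The abstract describes the loss only in words. As an illustration, the following minimal sketch shows one common loss of this kind, under the assumption (not confirmed by the abstract) that it takes the log-sum-exp form log(1 + Σ exp(s_neg)) + log(1 + Σ exp(−s_pos)) with a decision threshold of 0. The function names `pairwise_multilabel_loss` and `_log1p_sum_exp` are hypothetical and do not come from the paper.

```python
import math


def _log1p_sum_exp(values):
    """Return log(1 + sum(exp(v) for v in values)), computed stably.

    Including 0.0 in the list accounts for the "1 +" term, because exp(0) = 1.
    """
    vals = [0.0] + list(values)
    m = max(vals)
    return m + math.log(sum(math.exp(v - m) for v in vals))


def pairwise_multilabel_loss(scores, targets):
    """Assumed form of a pairwise target vs. non-target loss (illustrative only).

    scores  : list of raw (unnormalized) category scores
    targets : list of 0/1 flags, 1 marking a target category

    Non-target scores are pushed below 0 and target scores above 0, which
    implicitly pushes every target score above every non-target score.
    No per-class weights are needed, even when targets are very sparse.
    """
    neg = [s for s, t in zip(scores, targets) if not t]   # non-target scores
    pos = [-s for s, t in zip(scores, targets) if t]      # negated target scores
    return _log1p_sum_exp(neg) + _log1p_sum_exp(pos)


if __name__ == "__main__":
    # Target scored high and non-target scored low: loss is close to zero.
    print(pairwise_multilabel_loss([10.0, -10.0], [1, 0]))
    # Ranking reversed: loss is large.
    print(pairwise_multilabel_loss([-10.0, 10.0], [1, 0]))
```

Because the two terms multiply out to a sum over all (non-target, target) score differences, a single sparse tag matrix with few positive token pairs can be trained without hand-tuned class weights, which is one plausible reading of "automatically balance the weight".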

Full Text