A joint model for entity boundary detection and entity span recognition

Nian Yongming, Chen Yanping, Qin Yongbin, Huang Ruizhang, Tang Ruixue, Hu Ying

doi:10.1016/j.jksuci.2022.08.016

Open Access

Abstract

Named entity recognition is the task of extracting named entities that belong to predefined entity types. Span classification is a popular approach to this task. Its advantages are that it can handle nested structures and make full use of the token features within a span. The problem is that exhaustively enumerating and verifying all candidate spans suffers from high computational complexity and data imbalance. Furthermore, spans with a high overlap ratio share the same contextual features in a sentence, which easily leads to false positive errors caused by inaccurate entity boundaries. In this paper, we present a model that detects entity boundaries and predicts entity candidates jointly. Instead of labeling tokens, our model makes its predictions based on representations of the gaps between words, which avoids the ambiguity that arises when a token has several labels. We also propose a neighborhood span proposal strategy to generate reasonable negative samples for training, which effectively reduces the data imbalance problem. Our model is evaluated on the ACE2005 and GENIA corpora, where it achieves F1 scores of 88.55% and 79.81%, respectively, close to the state of the art.
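
The contrast between exhaustive span enumeration and a neighborhood-based proposal of negative samples can be illustrated with a small sketch. The abstract does not specify how the neighborhood span proposal strategy works, so the version below is only an assumption: it treats a "neighbor" as a span whose boundaries lie within a fixed window of a gold entity's boundaries. The function names, the `window` parameter, and the example sentence are all hypothetical. Spans are written as pairs of gap indices, where a sentence of n tokens has n + 1 gaps (gap 0 before the first token, gap n after the last).

```python
def enumerate_spans(n_tokens):
    """Exhaustively list every span as a (start_gap, end_gap) pair.

    A sentence with n tokens has n + 1 gaps, so there are n * (n + 1) / 2 spans.
    """
    return [(s, e) for s in range(n_tokens) for e in range(s + 1, n_tokens + 1)]


def neighborhood_negatives(gold_spans, n_tokens, window=1):
    """Hypothetical neighborhood proposal (an assumption, not the paper's method).

    Shift each gold boundary by at most `window` gaps and keep the
    resulting valid spans that are not gold entities themselves.
    """
    gold = set(gold_spans)
    negatives = set()
    for start, end in gold:
        for ds in range(-window, window + 1):
            for de in range(-window, window + 1):
                s, e = start + ds, end + de
                if 0 <= s < e <= n_tokens and (s, e) not in gold:
                    negatives.add((s, e))
    return sorted(negatives)


# Illustrative sentence with two nested entities (invented for this sketch).
tokens = ["the", "president", "of", "the", "bank", "spoke"]
gold_spans = [(0, 5), (3, 5)]  # "the president of the bank" and "the bank"

all_spans = enumerate_spans(len(tokens))
negatives = neighborhood_negatives(gold_spans, len(tokens), window=1)

print(len(all_spans), "spans in total,", len(gold_spans), "of them gold")
print(len(negatives), "neighborhood negatives:", negatives)
```

Under these assumptions, every proposed negative overlaps heavily with a gold entity, which is the kind of near-miss span the abstract identifies as a source of boundary errors, while the number of training candidates grows with the number of entities rather than quadratically with sentence length.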

Full Text