Abstract

Named entity recognition (NER) aims to extract entities from unstructured text, and nested structures often exist between entities. However, most previous studies focused on flat NER while ignoring nested entities. The importance of a word in the text should vary across entity categories. In this paper, we propose a head-to-tail linker (HTLinker) for nested NER. The proposed model exploits the extracted entity head as conditional information to locate the corresponding entity tails under different entity categories. This strategy takes part of the symmetric boundary information of the entity as a condition and effectively leverages information from the text to improve entity boundary recognition. The model also accounts for the variability in semantic correlation between tokens for different entity heads under different entity categories. To verify its effectiveness, we conducted extensive experiments on three datasets, ACE2004, ACE2005, and GENIA, achieving F1-scores of 80.5%, 79.3%, and 76.4%, respectively. The experimental results show that our model outperforms all compared methods.
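The head-to-tail decoding idea can be illustrated with a minimal sketch. This is a hypothetical illustration, not the paper's implementation: it assumes a model has already produced per-token head scores and, conditioned on each detected head and each entity category, per-token tail scores; all names and thresholds below are assumptions for illustration.

```python
# Hypothetical sketch of head-to-tail decoding for nested NER.
# Assumed inputs (not from the paper):
#   head_scores[i]         -> score that token i starts some entity
#   tail_scores[(i, c)][j] -> score that token j ends an entity of
#                             category c whose head is token i

def decode_entities(tokens, head_scores, tail_scores, categories, threshold=0.5):
    """Return (head, tail, category) spans. Nesting is handled naturally
    because every (head, category) pair is decoded independently."""
    entities = []
    for i, h in enumerate(head_scores):
        if h < threshold:
            continue  # token i is not predicted as an entity head
        for c in categories:
            for j, t in enumerate(tail_scores.get((i, c), [])):
                if j >= i and t >= threshold:
                    entities.append((i, j, c))
    return entities

tokens = ["New", "York", "University"]
head_scores = [0.9, 0.1, 0.2]
tail_scores = {
    (0, "GPE"): [0.0, 0.8, 0.0],  # "New York" as a location
    (0, "ORG"): [0.0, 0.0, 0.9],  # "New York University" as an organization
}
print(decode_entities(tokens, head_scores, tail_scores, ["GPE", "ORG"]))
# -> [(0, 1, 'GPE'), (0, 2, 'ORG')]: two nested entities sharing head token 0
```

Because the tail predictor is conditioned on both the head and the category, two entities that share a head token but differ in tail or category can be recovered simultaneously, which is exactly the nested case that single-layer sequence labeling cannot express.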

Highlights

  • Named entity recognition (NER) is a fundamental task in natural language processing, aiming to extract entities with pre-defined categories from unstructured texts

  • Compared to another boundary-based NER model, Boundary-aware [20], HTLinker extracts entities better on GENIA: its F1-score, precision, and recall are higher by 1.7%, 0.1%, and 3.2%, respectively

  • Compared with using entity boundary information as a condition to identify entity category labels, HTLinker more effectively identifies entity tails under different entity categories by taking entity heads as conditional information


Introduction

Named entity recognition (NER) is a fundamental task in natural language processing, aiming to extract entities with pre-defined categories from unstructured texts. Constructing effective NER models is essential for downstream tasks such as entity linking [1,2,3], relation extraction [4,5,6,7], event extraction [8,9], and question answering [10]. Traditional NER models [11,12,13,14,15] are usually based on a single-layer sequence labeling approach, which assigns one label to each token. However, not all entities in a text exist independently; nested structures may exist between different entities. Modeling this nested structure is both realistic and can further improve the accuracy of entity extraction.

