Abstract
Named Entity Recognition (NER) aims to extract structured entity information from unstructured text by identifying entity boundaries and categories. Chinese NER is more challenging than English NER because of the language's complex structure and ambiguous word boundaries, and because entities may be nested or discontinuous. Previous Chinese NER methods are limited by their character-based approach and by their dependence on external lexical information, which is often non-contextualized and therefore introduces noise that can compromise model performance. This paper proposes a novel Chinese NER model, HiNER, which leverages external semantic enhancement and hierarchical attention fusion. Specifically, we first formulate Chinese NER as a character-character relation classification task that explicitly accounts for nested and discontinuous entities. Then, by incorporating syntactic information, we develop a Triformer module that better integrates Chinese character, lexical, and syntactic embeddings; it takes into account the impact of external semantic enhancement on the original text embeddings and partly reduces interference from external information. In addition, by fusing local and global attention mechanisms, we enhance the representation of character-character relations, which allows the model to capture semantic features at different hierarchical levels of the Chinese context. We conduct extensive experiments on seven Chinese NER datasets, and the results indicate that HiNER achieves state-of-the-art (SOTA) performance. The results also confirm that external semantic enhancement and hierarchical attention fusion benefit the Chinese NER task.
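To make the character-character relation formulation more concrete, the following is a minimal sketch of how entities could be written into, and read back from, a relation grid. The abstract does not specify HiNER's label set, so the scheme here is an assumption: a hypothetical `NEXT` label links consecutive characters of an entity, and the entity type is stored in the cell from the entity's last character back to its first. The function names `encode_grid` and `decode_grid` are illustrative only, and the sketch covers just the labeling scheme, not the Triformer or the attention fusion.

```python
from typing import Dict, List, Tuple

# An entity is (character indices in order, entity type).
# Indices need not be contiguous, so discontinuous entities are allowed.
Entity = Tuple[Tuple[int, ...], str]
Grid = Dict[Tuple[int, int], str]


def encode_grid(entities: List[Entity]) -> Grid:
    """Build a sparse character-character relation grid (assumed scheme).

    Cell (a, b) with a < b holds "NEXT" if b follows a inside an entity.
    Cell (tail, head) with tail >= head holds the entity type.
    Two entities with the same head and tail would overwrite each other;
    this sketch ignores that case.
    """
    grid: Grid = {}
    for indices, entity_type in entities:
        for a, b in zip(indices, indices[1:]):
            grid[(a, b)] = "NEXT"
        grid[(indices[-1], indices[0])] = entity_type
    return grid


def decode_grid(grid: Grid) -> List[Entity]:
    """Recover entities by following NEXT links from each head to its tail."""
    successors: Dict[int, List[int]] = {}
    for (a, b), label in grid.items():
        if label == "NEXT":
            successors.setdefault(a, []).append(b)

    found: List[Entity] = []

    def walk(path: List[int], tail: int, entity_type: str) -> None:
        current = path[-1]
        if current == tail:
            found.append((tuple(path), entity_type))
            return
        for nxt in successors.get(current, []):
            if nxt <= tail:  # NEXT links only move forward, never past the tail
                walk(path + [nxt], tail, entity_type)

    for (tail, head), label in grid.items():
        if label != "NEXT" and tail >= head:
            walk([head], tail, label)
    return sorted(found)


# Example sentence: 南京市长江大桥 (indices 0..6), with a nested entity.
entities: List[Entity] = [
    ((0, 1, 2), "LOC"),     # 南京市
    ((3, 4, 5, 6), "LOC"),  # 长江大桥
    ((3, 4), "LOC"),        # 长江, nested inside 长江大桥
]
grid = encode_grid(entities)
print(decode_grid(grid))
```

In a model built on this kind of formulation, a classifier would predict one label per cell of the grid, and a decoding step like `decode_grid` would turn the predictions back into entities. Path-based decoding of this sort can produce spurious entities when the links of overlapping entities combine, so it should be read as an illustration rather than a complete decoder.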