Named Entity Recognition (NER) is a crucial component of Natural Language Processing (NLP). When dealing with the high diversity and complexity of the Chinese language, existing Chinese NER models face challenges in addressing word sense ambiguity, capturing long-range dependencies, and maintaining robustness, which hinders the accuracy of entity recognition. To this end, a Chinese NER model based on multi-level representation learning is proposed. The model leverages a pre-trained word-based embedding to capture contextual information. A linear layer adjusts dimensions to fit an Extended Long Short-Term Memory (XLSTM) network, enabling the capture of long-range dependencies and contextual information, and providing deeper representations. An adaptive multi-head attention mechanism is proposed to enhance the ability to capture global dependencies and comprehend deep semantic context. Additionally, GlobalPointer with rotational position encoding integrates global information for entity category prediction. Projected Gradient Descent (PGD) is incorporated, introducing perturbations in the embedding layer of the pre-trained model to enhance stability in noisy environments. The proposed model achieves F1-scores of 96.89%, 74.89%, 72.19%, and 80.96% on the Resume, Weibo, CMeEE, and CLUENER2020 datasets, respectively, demonstrating improvements over baseline and comparison models.
Read full abstract