Abstract

Named Entity Recognition (NER) is a typical sequence labeling problem and a foundation of text information processing. It has come to play a key role in natural language processing (NLP), where it supports many downstream tasks. Current methods mainly use BERT and other powerful deep learning components to extract semantic information from sentences in complex texts, and then perform sequence labeling to identify entities. However, such methods are prone to overfitting and often show poor generalization and robustness. To extract entities more effectively, we propose a named entity recognition model based on adversarial training. In the encoding layer, we use Bidirectional Encoder Representations from Transformers (BERT) as the pre-trained language model to obtain word vectors with an enriched semantic representation. We combine the word vectors obtained from BERT with those obtained from GloVe and pass them to a Bi-LSTM in the feature extraction layer, which is trained to adapt to the perturbation and recognize named entities. Our core innovation is introducing the Fast Gradient Method (FGM) to generate adversarial examples for the adversarial attack, which adds perturbations to the encoding layer. In this way, we strengthen both the generalization ability and the robustness of the model, thereby improving its performance. We evaluate the proposed model on the widely used Chinese Resume NER dataset. The experimental results show that our model achieves competitive results on named entity recognition, with precision, recall, and F1 score of 0.9541, 0.9538, and 0.9536, respectively.
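To make the FGM step concrete, the following is a minimal sketch and not the authors' implementation. It assumes the commonly used L2-normalized form of the perturbation, r = epsilon * g / ||g||_2, where g is the gradient of the loss with respect to the embedding. The abstract does not state the exact formula or the value of epsilon. The function names, the toy logistic "model" standing in for the BERT + Bi-LSTM tagger, and all numbers are hypothetical and chosen only for illustration.

```python
import math


def fgm_perturbation(grad, epsilon=1.0):
    """Return r = epsilon * g / ||g||_2 for a gradient given as a flat list.

    Assumption: L2-normalized FGM; epsilon is an illustrative value.
    """
    norm = math.sqrt(sum(g * g for g in grad))
    if norm == 0.0 or math.isnan(norm):
        # No usable gradient direction: apply no perturbation.
        return [0.0 for _ in grad]
    return [epsilon * g / norm for g in grad]


def logistic_loss_and_grad(embedding, weights, label):
    """Toy stand-in for the tagger: binary cross-entropy of sigmoid(w . e).

    Returns the loss and its gradient with respect to the embedding e.
    """
    z = sum(w * e for w, e in zip(weights, embedding))
    p = 1.0 / (1.0 + math.exp(-z))
    loss = -(label * math.log(p) + (1 - label) * math.log(1.0 - p))
    grad = [(p - label) * w for w in weights]
    return loss, grad


# Hypothetical embedding, weights, and label for a single token.
embedding = [0.5, -1.0, 2.0]
weights = [1.0, 2.0, -0.5]
label = 1

# 1. Forward/backward pass on the clean embedding.
clean_loss, grad = logistic_loss_and_grad(embedding, weights, label)

# 2. Build the adversarial perturbation and add it to the embedding.
r_adv = fgm_perturbation(grad, epsilon=0.5)
adv_embedding = [e + r for e, r in zip(embedding, r_adv)]

# 3. Loss on the perturbed embedding; in adversarial training its gradient
#    would be accumulated with the clean gradient before the optimizer step.
adv_loss, _ = logistic_loss_and_grad(adv_embedding, weights, label)

print(f"clean loss: {clean_loss:.4f}, adversarial loss: {adv_loss:.4f}")
```

In this toy setting the perturbation moves the embedding in the direction that increases the loss, so the adversarial loss is higher than the clean loss. Training on both losses is what is expected to make the model less sensitive to small changes in the encoding layer.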
