Abstract

Named entity recognition has a variety of applications in journalism, where it can extract relevant information from voluminous daily news reports. However, its applicability is limited: existing models do not learn dynamic word vectors and are complex. This paper builds on ALBERT, the lightweight dynamic word-vector generation model proposed by Google, combining it with a Bidirectional Long Short-Term Memory network (BiLSTM) and a Conditional Random Field (CRF) to form the ALBERT-BiLSTM-CRF model. Using the 2014 edition of the People's Daily published on the Internet as the primary data set, the paper compares this model against traditional statistical models and classic NLP models. The experimental results show that ALBERT-BiLSTM-CRF has a clear advantage over classic natural language processing (NLP) models, increasing the recognition accuracy and recall rate for named entities in news text. The model's accuracy and recall on the test data set reached 94.49% and 89.50%, respectively, while its small size allows for lightweight deployment.

Keywords: Chinese named entity recognition, Conditional random field, ALBERT model, Dynamic word vector
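The CRF layer named in the abstract is what turns the BiLSTM's per-token tag scores into a globally consistent tag sequence. As a minimal sketch (not the paper's implementation), the following pure-Python Viterbi decoder shows that role: given emission scores (one score per tag per token, as a BiLSTM head would produce) and tag-to-tag transition scores (as a CRF learns), it finds the highest-scoring tag path. The toy scores in the usage example are invented for illustration.

```python
# Sketch of CRF decoding (Viterbi) over BiLSTM emission scores.
# All numbers below are illustrative, not from the paper.

def viterbi(emissions, transitions):
    """emissions: per-token lists of tag scores (from the BiLSTM head);
    transitions[i][j]: learned score for moving from tag i to tag j (CRF).
    Returns the highest-scoring tag index sequence."""
    n_tags = len(emissions[0])
    score = list(emissions[0])          # best score of any path ending in each tag
    back = []                           # backpointers, one list per later token
    for emit in emissions[1:]:
        new_score, ptrs = [], []
        for j in range(n_tags):
            # best previous tag to transition into tag j
            best_i = max(range(n_tags), key=lambda i: score[i] + transitions[i][j])
            new_score.append(score[best_i] + transitions[best_i][j] + emit[j])
            ptrs.append(best_i)
        score = new_score
        back.append(ptrs)
    # backtrack from the best final tag
    tag = max(range(n_tags), key=lambda t: score[t])
    path = [tag]
    for ptrs in reversed(back):
        tag = ptrs[tag]
        path.append(tag)
    return path[::-1]

# Toy example with two tags; transitions reward staying in the same tag,
# so the decoder smooths the per-token evidence into a consistent path.
emissions = [[3, 0], [0, 3], [0, 3]]
transitions = [[1, -1], [-1, 1]]
print(viterbi(emissions, transitions))  # → [0, 1, 1]
```

In the full model, the same transition matrix also enters the training loss, so the CRF learns which tag sequences (e.g. B-PER followed by I-PER) are plausible rather than scoring each token independently.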


