Abstract

Key population control bears on national security and social stability. To address the difficulty of extracting entities from key population information, this paper proposes a sequence labeling method based on dynamic word vectors. The Word2vec pre-trained model is replaced with the BERT model, which strengthens the feature extraction capability of the traditional entity extraction model and more fully captures the multiple semantic and syntactic properties of words. The BiLSTM-CRF model is improved as follows: a BERT model is embedded upstream to convert the original corpus into dynamic vector representations; the resulting word vectors are fed into the BiLSTM layer for semantic encoding, which further mines the semantic features of each entity's context; finally, the CRF layer outputs the label sequence with the maximum probability. After training on a key population information data set, the model reached an F1 value of 0.90.
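
The final step, in which the CRF layer outputs the label sequence with the maximum probability, is commonly implemented with Viterbi decoding. The following is a minimal sketch of that step only, not the paper's implementation. The tag set (`O`, `B`, `I`), the function name `viterbi_decode`, and all scores are hypothetical values chosen for illustration; in the described model, the per-token emission scores would come from the BiLSTM layer and the transition scores would be learned by the CRF.

```python
from typing import List, Sequence


def viterbi_decode(emissions: Sequence[Sequence[float]],
                   transitions: Sequence[Sequence[float]]) -> List[int]:
    """Return the label indices of the highest-scoring path.

    emissions[t][j]   : score of label j at token t (assumed BiLSTM output)
    transitions[i][j] : score of moving from label i to label j
    """
    if not emissions:
        return []
    num_labels = len(emissions[0])
    # best[j] = score of the best path that ends in label j at the current token
    best = list(emissions[0])
    backpointers = []
    for t in range(1, len(emissions)):
        new_best, pointers = [], []
        for j in range(num_labels):
            # Best previous label i for the current label j.
            prev = max(range(num_labels),
                       key=lambda i: best[i] + transitions[i][j])
            new_best.append(best[prev] + transitions[prev][j] + emissions[t][j])
            pointers.append(prev)
        best = new_best
        backpointers.append(pointers)
    # Trace back from the best final label.
    last = max(range(num_labels), key=lambda j: best[j])
    path = [last]
    for pointers in reversed(backpointers):
        last = pointers[last]
        path.append(last)
    path.reverse()
    return path


# Hypothetical tag set and toy scores, for illustration only.
LABELS = ["O", "B", "I"]
transitions = [
    [0.0, 0.0, -10.0],  # from O: an entity cannot continue (I) right after O
    [0.0, -1.0, 1.0],   # from B
    [0.0, -1.0, 0.5],   # from I
]
emissions = [
    [1.0, 0.2, 0.1],    # token 0
    [0.1, 0.8, 1.0],    # token 1: emission alone slightly prefers I
    [0.1, 0.2, 1.5],    # token 2
]

decoded = [LABELS[k] for k in viterbi_decode(emissions, transitions)]
greedy = [LABELS[max(range(len(row)), key=row.__getitem__)] for row in emissions]
print(decoded)  # ['O', 'B', 'I']
print(greedy)   # ['O', 'I', 'I']
```

Choosing the best label for each token independently yields the invalid sequence `O I I`, whereas decoding over the whole sequence yields `O B I`, because the transition scores penalize an `I` that directly follows an `O`. This is the kind of label-dependency constraint a CRF layer adds on top of per-token scores.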
