Abstract

To address the low utilization of entity features, polysemy, and poor recognition of specialized terms in Chinese power marketing domain named entity recognition (PMDNER), this study proposes a Chinese power marketing named entity recognition method based on whole word masking and joint extraction of dual features. First, the electricity text data is converted to word vectors using the RoBERTa pre-trained model. These vectors are then fed into the constructed dual feature extraction neural network (DFENN), which extracts the local and global features of the text in parallel and fuses them. The output of the RoBERTa layer serves as the auxiliary classification layer and the output of the DFENN layer as the master classification layer; an attention mechanism dynamically weights and combines the two outputs into new fused features, which are input into the conditional random field (CRF) layer to obtain the most reasonable label sequence. A focal loss function is used during training to alleviate the problem of uneven sample distribution. The experimental results show that the method achieves an F1 value of 88.58% on the constructed power marketing named entity recognition dataset, a significant improvement over existing methods.
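
To illustrate why a focal loss helps with uneven sample distribution, the following is a minimal sketch of the standard token-level focal loss, FL(p_t) = -alpha * (1 - p_t)^gamma * log(p_t). The abstract does not state the hyperparameters or how the loss is combined with the CRF layer, so the values gamma = 2.0 and alpha = 1.0, the function names, and the example logits are all assumptions chosen for illustration, not the authors' configuration.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of raw scores."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, target):
    """Plain cross-entropy for one token: -log(p_t)."""
    return -math.log(softmax(logits)[target])


def focal_loss(logits, target, gamma=2.0, alpha=1.0):
    """Focal loss for one token (gamma and alpha are assumed values).

    The factor (1 - p_t)^gamma shrinks the loss of tokens the model
    already classifies confidently, so rare, hard labels contribute
    relatively more to the total loss.
    """
    p_t = softmax(logits)[target]
    return -alpha * (1.0 - p_t) ** gamma * math.log(p_t)


# Hypothetical logits over three labels; the true label is index 0.
easy_token = [4.0, 0.0, 0.0]  # model is confident and correct
hard_token = [0.0, 1.0, 0.0]  # model gives the true label low probability

for name, logits in (("easy", easy_token), ("hard", hard_token)):
    ce = cross_entropy(logits, 0)
    fl = focal_loss(logits, 0)
    print(f"{name}: cross-entropy={ce:.4f}, focal={fl:.6f}")
```

With gamma = 0 and alpha = 1 the expression reduces to ordinary cross-entropy; larger gamma values down-weight well-classified tokens more strongly, which in a tagging task are typically the frequent non-entity tokens.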
