Nested entities and specialized terminology are difficult to identify in power dispatch text. To address this, a multi-task few-shot named entity recognition model for power dispatch (FSPD-NER) is proposed. The model consists of four modules: feature enhancement, seed, expansion, and implication. First, the masking strategy of the encoder is improved by adopting whole-word masking; a RoBERTa (Robustly Optimized BERT Pretraining Approach) encoder serves as the embedding layer to obtain the text feature representation, and an IDCNN (Iterated Dilated CNN) module enhances the features. Next, the text is split into one-character and two-character Chinese segments that form the seed set, and a score is calculated for each seed; seeds whose score exceeds the threshold ω are passed to the expansion module as candidate seeds. The candidate seeds are then expanded to the left and right according to an offset γ to obtain candidate entities. Finally, text implication (entailment) pairs are constructed: the input text serves as the premise sentence, each candidate entity is combined with predefined label templates to form hypothesis sentences, and the pairs are passed to the RoBERTa encoder for classification. A focal loss function is used to alleviate label imbalance during training. Experimental results on the power dispatch dataset show that, in the 20-shot setting, the model achieves precision, recall, and F1 scores of 63.39%, 61.97%, and 62.67%, respectively, a significant improvement over existing methods.
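The seed, expansion, and implication steps described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the `score` callable (a stand-in for the model's learned seed scorer), the half-open span convention, and the template format are all assumptions made for illustration. In particular, the abstract does not specify how the offset γ is applied; here it is assumed that a seed may grow by 0 to γ characters on each side.

```python
from typing import Callable, Dict, List, Tuple

Span = Tuple[int, int]  # half-open [start, end) character offsets (assumed convention)


def make_seeds(text: str) -> List[Span]:
    """Enumerate every one-character and two-character span of the text."""
    n = len(text)
    return [(i, i + k) for k in (1, 2) for i in range(n - k + 1)]


def select_seeds(
    text: str,
    seeds: List[Span],
    score: Callable[[str, Span], float],  # hypothetical stand-in for the seed scorer
    omega: float,
) -> List[Span]:
    """Keep only seeds whose score is strictly greater than the threshold omega."""
    return [s for s in seeds if score(text, s) > omega]


def expand(seed: Span, n: int, gamma: int) -> List[Span]:
    """Grow a seed by 0..gamma characters on each side, clipped to the text.

    Assumption: every left/right combination is a candidate entity.
    """
    start, end = seed
    return [
        (left, right)
        for left in range(max(0, start - gamma), start + 1)
        for right in range(end, min(n, end + gamma) + 1)
    ]


def build_pairs(
    text: str,
    spans: List[Span],
    templates: Dict[str, str],  # label -> template containing "{entity}" (hypothetical)
) -> List[Tuple[str, str, str]]:
    """Build (premise, hypothesis, label) triples for an entailment classifier."""
    pairs = []
    for left, right in sorted(set(spans)):  # deduplicate overlapping expansions
        entity = text[left:right]
        for label, template in templates.items():
            pairs.append((text, template.format(entity=entity), label))
    return pairs
```

In a full pipeline, each resulting premise and hypothesis pair would be encoded jointly and classified as entailed or not, with the entailed label assigned to the candidate entity. Enumerating every expansion keeps the sketch simple and lets nested candidates coexist, at the cost of a number of pairs that grows with (γ + 1)² per seed and with the number of label templates.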