Abstract

This work studies named entity recognition (NER) in legal texts using deep learning models. First, a bidirectional long short-term memory network with a conditional random field layer (Bi-LSTM-CRF) is established for NER in legal texts. Second, different annotation methods are used to compare the entity recognition performance of the Bi-LSTM-CRF model. Finally, different objective (loss) functions are set to compare the entity recognition performance of the model. The results show that the model trained on the word sequence labeling corpus reaches an F1 value of 88.13% on named entities, higher than that of the model trained on the corpus produced by the other annotation method. For the two entity types of place names and organization names, the F1 values obtained by the Bi-LSTM-CRF model using word segmentation are 67.60% and 89.45%, respectively, higher than those obtained by the model using character segmentation. Therefore, the Bi-LSTM-CRF model using word segmentation is more suitable for recognizing longer entities. Parameter learning with the log-likelihood objective gives better results than learning with the maximum interval (max-margin) criterion, so log-likelihood is the better fit for the Bi-LSTM-CRF model. This method provides ideas for research on entity recognition in legal texts and has some reference value.
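
The difference between the character-level and word-level annotation methods compared above can be illustrated with a minimal sketch. It assumes a BIO tagging scheme and entity spans given as character offsets; the function name `bio_tags`, the `ORG` label, and the sample sentence are hypothetical and are not taken from the paper's corpus or code.

```python
def bio_tags(units, entities):
    """Assign BIO tags to a sequence of units (characters or words).

    units:    list of strings whose concatenation is the sentence.
    entities: list of (start, end, label) character offsets, end exclusive.
    A unit is tagged only if it lies entirely inside an entity span.
    """
    tags = []
    pos = 0  # character offset of the current unit
    for unit in units:
        start, end = pos, pos + len(unit)
        tag = "O"
        for e_start, e_end, label in entities:
            if e_start <= start and end <= e_end:
                prefix = "B-" if start == e_start else "I-"
                tag = prefix + label
                break
        tags.append(tag)
        pos = end
    return tags


# Made-up example sentence (assumption): one organization name, then a verb phrase.
sentence = "北京市高级人民法院审理此案"
entities = [(0, 9, "ORG")]  # "北京市高级人民法院"

# Character-level annotation: every character is one unit.
char_units = list(sentence)
char_tags = bio_tags(char_units, entities)

# Word-level annotation: units come from a (hypothetical) word segmenter.
word_units = ["北京市", "高级", "人民", "法院", "审理", "此案"]
word_tags = bio_tags(word_units, entities)

print(list(zip(char_units, char_tags)))
print(list(zip(word_units, word_tags)))
```

In this toy case the same organization name spans nine tags at the character level but only four at the word level, which shows how the two labeling corpora present the same entity to the model as sequences of different length.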
