Extracting named entities from Russian-language documents with different expressiveness of structure

Maria D Averina, Olga A Levanova

doi:10.18255/1818-1015-2023-4-382-393

Open Access

https://doi.org/10.18255/1818-1015-2023-4-382-393

Journal: Modeling and Analysis of Information Systems
Publication Date: Dec 11, 2023
License type: cc-by

Affiliation: Yaroslavl State University

#CRF Model #Optimization Algorithms

Abstract

This work addresses named entity recognition for Russian-language texts using a CRF (conditional random field) model. Two datasets were considered: refinancing documents with a well-defined structure and semi-structured court records. The model was tested with various sets of text features and CRF parameters (optimization algorithms). Averaged over all entities, the best F-measure was 0.99 for the structured documents and 0.86 for the semi-structured ones.
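
To make the notion of "text features" concrete, the following is a minimal sketch of per-token feature extraction of the kind commonly fed to a CRF tagger. The abstract does not list the authors' features, so the feature names (`lower`, `suffix3`, `is_title`, and so on), the helper names `token_features` and `sentence_features`, and the sample sentence are all illustrative assumptions.

```python
from typing import Dict, List


def token_features(tokens: List[str], i: int) -> Dict[str, object]:
    """Build a feature dict for tokens[i].

    The feature set is a hypothetical example, not the one used in the paper.
    """
    word = tokens[i]
    feats: Dict[str, object] = {
        "bias": 1.0,
        "lower": word.lower(),          # case-normalized surface form
        "suffix3": word[-3:].lower(),   # crude stand-in for Russian inflection endings
        "is_title": word.istitle(),     # capitalized tokens often start names
        "is_upper": word.isupper(),
        "is_digit": word.isdigit(),     # dates, sums, case numbers
    }
    # Context features: neighbouring tokens, or sentence-boundary flags.
    if i > 0:
        feats["prev_lower"] = tokens[i - 1].lower()
    else:
        feats["BOS"] = True
    if i < len(tokens) - 1:
        feats["next_lower"] = tokens[i + 1].lower()
    else:
        feats["EOS"] = True
    return feats


def sentence_features(tokens: List[str]) -> List[Dict[str, object]]:
    """Return one feature dict per token, in sentence order."""
    return [token_features(tokens, i) for i in range(len(tokens))]


if __name__ == "__main__":
    # Hypothetical sentence: "Ivanov appealed to the court"
    for feats in sentence_features(["Иванов", "обратился", "в", "суд"]):
        print(feats)
```

A sequence of such dicts, paired with one label per token, is the usual input format for CRF toolkits. Varying which keys are included corresponds to testing the model under different feature sets.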

Full Text