Abstract

Named entity recognition (NER) aims to extract important information from unstructured natural language text. In this work, we investigate the effectiveness of modern multi-task NER approaches on Russian-language corpora, employing several NER datasets and a dataset of part-of-speech (POS) tags. We apply a state-of-the-art neural architecture based on bidirectional LSTMs and conditional random fields, with convolutional neural networks used to learn character-level features. We carry out an extensive experimental evaluation on three standard datasets of news articles written in Russian. The proposed multi-task model achieves state-of-the-art results, with an F1 score of 88.04% on Gareev’s dataset and an F1 score of 99.49% on the Person-1000 dataset.
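
The abstract reports F1 scores but does not say how they are computed. A common convention in NER evaluation is entity-level F1 over exact span matches. The following is a minimal sketch under that assumption, not the authors' evaluation script. The BIO tagging scheme, the function names `bio_to_spans` and `entity_f1`, and the example tags are all illustrative assumptions.

```python
from typing import List, Set, Tuple

# (entity type, start index, end index exclusive)
Span = Tuple[str, int, int]


def bio_to_spans(tags: List[str]) -> Set[Span]:
    """Convert a BIO tag sequence into a set of typed entity spans.

    Assumption: a stray I- tag with no preceding B- tag of the same type
    is treated leniently as the start of a new entity.
    """
    spans: Set[Span] = set()
    start, label = None, None
    # The trailing "O" is a sentinel that closes an entity ending at the last token.
    for i, tag in enumerate(tags + ["O"]):
        continues_entity = tag.startswith("I-") and label == tag[2:]
        if not continues_entity:
            if label is not None:
                spans.add((label, start, i))
            start, label = None, None
            if tag.startswith("B-") or tag.startswith("I-"):
                start, label = i, tag[2:]
    return spans


def entity_f1(gold: List[str], pred: List[str]) -> float:
    """Entity-level F1: a prediction counts only if type and boundaries match exactly."""
    gold_spans, pred_spans = bio_to_spans(gold), bio_to_spans(pred)
    true_positives = len(gold_spans & pred_spans)
    precision = true_positives / len(pred_spans) if pred_spans else 0.0
    recall = true_positives / len(gold_spans) if gold_spans else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical example: the model finds the person but misses the location.
gold = ["B-PER", "I-PER", "O", "B-LOC"]
pred = ["B-PER", "I-PER", "O", "O"]
score = entity_f1(gold, pred)
```

In this example precision is 1.0 and recall is 0.5, so the score is 2/3. Under exact-match scoring a partially recognized entity earns no credit, which makes this metric stricter than token-level accuracy.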
