Methods and Approaches for Automatic Entity Linking in Russian

Anastasia Alekseevna Mezentseva, Tatiana Viktorovna Batura, Elena Pavlovna Bruches

doi:10.15514/ispras-2022-34(4)-13

Open Access

https://doi.org/10.15514/ispras-2022-34(4)-13

Abstract

There is growing interest in solving NLP tasks with the help of external knowledge stores, for example in information retrieval, question answering, and dialogue systems. This makes it important to establish links between entities mentioned in the processed text and entries in a knowledge base. This article is devoted to entity linking with Wikidata as the external knowledge base, where the entities are scientific terms in Russian. A traditional entity linking system has three stages: entity recognition, candidate generation from the knowledge base, and candidate ranking. Our system takes as input raw text in which the terms are already marked. To generate candidates, we use string matching between the terms in the input text and entities in Wikidata. Candidate ranking is the most complicated stage because it requires semantic information. We conducted several experiments on candidate ranking with different models, including an approach based on cosine similarity, classical machine learning algorithms, and neural networks. We also extended the RUSERRC dataset with manually annotated data for model training. The results show that the approach based on cosine similarity outperforms the others and does not require manually annotated data. The dataset and the system are open source and available to other researchers.
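The two stages the abstract describes, string-match candidate generation and cosine-similarity ranking, can be illustrated with a minimal sketch. It rests on several assumptions that are not taken from the article: the knowledge base is a tiny invented list rather than Wikidata, the text is English rather than Russian, the function names are hypothetical, and plain term-frequency vectors stand in for whatever text representation the authors actually used.

```python
import math
from collections import Counter

# Hypothetical toy knowledge base: the IDs, labels, and descriptions are
# invented for illustration and are not real Wikidata entries.
TOY_KB = [
    {"id": "E1", "label": "kernel",
     "description": "core component of an operating system that manages hardware"},
    {"id": "E2", "label": "kernel",
     "description": "function used in machine learning to compute similarity in a feature space"},
    {"id": "E3", "label": "graph",
     "description": "mathematical structure of vertices connected by edges"},
]


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


def cosine_similarity(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def generate_candidates(term, kb):
    """Candidate generation: case-insensitive exact match on the entity label."""
    return [entity for entity in kb if entity["label"].lower() == term.lower()]


def rank_candidates(context, candidates):
    """Candidate ranking: score each description against the term's context."""
    context_vec = Counter(tokenize(context))
    scored = [
        (cosine_similarity(context_vec, Counter(tokenize(c["description"]))), c["id"])
        for c in candidates
    ]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


def link(term, context, kb):
    """Return the ID of the best-scoring candidate, or None if nothing matches."""
    ranked = rank_candidates(context, generate_candidates(term, kb))
    return ranked[0][1] if ranked else None
```

With this toy data, calling link("kernel", "the kernel function maps data into a feature space for machine learning", TOY_KB) selects E2, because only that description shares words with the context. Ranking of this kind needs no annotated training data, which is consistent with the advantage the abstract reports for the cosine similarity approach.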

Journal: Proceedings of the Institute for System Programming of the RAS
Publication Date: Jan 1, 2022
License type: cc-by
