Extraction of the Relations among Significant Pharmacological Entities in Russian-Language Reviews of Internet Users on Medications

Alexander Sboev, Ivan Moloshnikov, Roman Rybka, Sanna Sboeva, Gleb Rylkov, Artem Gryaznov, Anton Selivanov

doi:10.3390/bdcc6010010

Open Access

https://doi.org/10.3390/bdcc6010010

Abstract

Nowadays, analyzing digital media to predict society’s reaction to particular events and processes is a task of great significance. Internet sources contain a large amount of meaningful information for a range of domains, such as marketing, author profiling, social situation analysis, healthcare, etc. In healthcare, this information is useful for pharmacovigilance purposes, including the re-profiling of medications. Analyzing these sources requires the development of automatic natural language processing methods. These methods, in turn, require text datasets with complex annotation, including information about named entities and the relations between them. As the relevant literature analysis shows, there is a scarcity of datasets in the Russian language with annotated entity relations, and none have existed so far in the medical domain. This paper presents the first Russian-language textual corpus where entities have labels of different contexts within a single text, so that related entities share a common context. Therefore, this corpus, which belongs to the medical domain, is suitable for the task of relation extraction. Our second contribution is a method for the automated extraction of entity relations in Russian-language texts using the XLM-RoBERTa language model pretrained on Russian drug review texts. A comparison with other machine learning methods is performed to assess the efficiency of the proposed method. The method yields state-of-the-art accuracy in extracting the following relationship types: ADR–Drugname, Drugname–Diseasename, Drugname–SourceInfoDrug, Diseasename–Indication. As shown on the presented subcorpus from the Russian Drug Review Corpus, the developed method achieves a mean F1-score of 80.4% (estimated with cross-validation, averaged over the four relationship types). This result is 3.6% higher than that of the existing language model RuBERT and 21.77% higher than that of basic ML classifiers.
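
The averaging behind a "mean F1-score over the four relationship types" can be sketched as follows. This is only an illustration under stated assumptions: each relation type is treated as a binary decision (related or not related) and F1 is computed for the positive class, which the abstract does not specify. The function names and the toy labels are made up here and are not results from the paper.

```python
def f1_score(y_true, y_pred):
    """F1 of the positive class (label 1) for binary labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def mean_f1(results):
    """Average per-type F1 with equal weight for every relation type."""
    scores = {rel: f1_score(t, p) for rel, (t, p) in results.items()}
    return sum(scores.values()) / len(scores), scores


# Toy (gold, predicted) labels per relation type; illustrative only.
toy_results = {
    "ADR–Drugname": ([1, 1, 0, 0], [1, 1, 0, 0]),
    "Drugname–Diseasename": ([1, 1, 0, 0], [1, 0, 0, 0]),
    "Drugname–SourceInfoDrug": ([1, 0, 0, 1], [1, 1, 0, 1]),
    "Diseasename–Indication": ([1, 1, 1, 0], [0, 0, 0, 0]),
}

average, per_type = mean_f1(toy_results)
for rel, score in per_type.items():
    print(f"{rel}: {score:.3f}")
print(f"mean F1: {average:.3f}")
```

Because every relation type gets the same weight, a type with few examples influences the mean as much as a frequent one. Under cross-validation, the same computation would be repeated per fold and the fold results averaged.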

Highlights

The developing ecosystem of social networks and other special Internet platforms expands the possibility of discussion of a broad set of topics in textual format.
Summarizing the above, it can be concluded that the current trend in identifying relationships between named entities is the use of models with transformer architecture pretrained on large datasets. We develop this approach based on the XLM-RoBERTa language model [35] using the Russian Drug Review Corpus (RDRS) [3] described in Section 3.1 and available at the Sagteam project website.
The comparison shows that the language model should receive both the target entities separated from the text and the entire text in order to achieve high accuracy and to outperform basic machine learning methods.
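
The input layout named in the last highlight, with the two target entities given separately and followed by the entire text, can be sketched as below. This is a minimal illustration, not the authors' implementation: the `Entity` class, the `build_relation_input` function, the `" [SEP] "` placeholder separator, and the English example review are all assumptions introduced here. A real pipeline would use the separator token of the model's own tokenizer and Russian review texts.

```python
from dataclasses import dataclass


@dataclass
class Entity:
    text: str   # surface form of the entity mention
    label: str  # entity type, e.g. "Drugname" or "ADR"


def build_relation_input(first: Entity, second: Entity, review: str,
                         sep: str = " [SEP] ") -> str:
    """Place the two candidate entities before the full review text.

    The classifier then sees which pair is being asked about as well as
    the whole context in which the pair occurs.
    """
    return sep.join([first.text, second.text, review])


# Made-up example review (hypothetical, not taken from the corpus).
review = "After taking Aspirin I got a headache."
drug = Entity("Aspirin", "Drugname")
adr = Entity("headache", "ADR")

model_input = build_relation_input(drug, adr, review)
print(model_input)
```

One such string would be built for every candidate entity pair in a review, and a classifier on top of the language model would decide whether the pair is related.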

Summary

Introduction

The developing ecosystem of social networks and other special Internet platforms expands the possibility of discussion of a broad set of topics in textual format. These texts often contain people’s publicly available opinions on various subjects. One topic of special interest is Internet reviews of medications, which include information about their positive and adverse effects, qualities, manufacturers, administration regimen, etc. Such information could be useful in comprehensive analysis for the purposes of pharmacovigilance [1] and potential medicine re-profiling.

Journal: Big Data and Cognitive Computing
Publication Date: Jan 17, 2022
Citations: 4
License type: CC BY 4.0

Similar Papers

Inter-sentence Relation Extraction Based on Relation-level Attention Mechanism
Qi Wang ... Bihui Yu
01 Oct 2020

Accuracy Analysis of the End-to-End Extraction of Related Named Entities from Russian Drug Review Texts by Modern Approaches Validated on English Biomedical Corpora
Alexander Sboev ... Alexander Naumov
Mathematics | VOL. 11
09 Jan 2023

RoBERT-Agr: An Entity Relationship Extraction Model of Massive Agricultural Text Based on the RoBERTa and CRF Algorithm
Tianyue Chen ... Xiang Li
03 Mar 2023

Relation Extraction from Texts Containing Pharmacologically Significant Information on base of Multilingual Language Models
Anton Aleksandrovich Selivanov ... Roman Rybka
14 Nov 2022
