Auto-abstracting of texts in the Kazakh language

Diana Rakhimova, Aliya Turganbayeva

doi:10.1145/3410352.3410832

Abstract

In this article, the authors propose an approach to abstracting text resources and documents in the Kazakh language. The text data, developed by the authors' research team, was prepared for further processing using software solutions for normalizing Kazakh texts. Abstracting is based on keywords and phrases, which are extracted from Kazakh texts with the TF-IDF algorithm. The problem is solved with an approach based on machine learning. To determine how similar sentences are, the cosine similarity between them is calculated, and the semantic content of the text is determined from these values. When annotations are output, the volume of the text is taken into account: the length of the annotation depends on the volume of the document. Abstracting texts in the Kazakh language is a pressing task that bears on text classification, clustering, and information retrieval. The paper presents the results of experimental calculations for various approaches. The results of the study show that the presented approach is the best solution for extracting annotations from texts in the Kazakh language.
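The pipeline described in the abstract (TF-IDF weighting, cosine similarity between sentences, and an annotation length that scales with the document) can be illustrated with a minimal sketch. This is not the authors' implementation: the sentence splitter, the tokenizer, the smoothed IDF formula, the scoring rule (mean similarity to all other sentences), and the `ratio` parameter are all assumptions made for illustration, and the paper's Kazakh normalization step is not reproduced.

```python
import math
import re
from collections import Counter


def split_sentences(text):
    # Naive split after ., ! or ? (assumption; the paper's normalizer is not shown).
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(sentence):
    # Lowercased word tokens; the pattern is Unicode-aware, so Kazakh Cyrillic letters match.
    return re.findall(r"\w+", sentence.lower())


def tfidf_vectors(token_lists):
    # One sparse TF-IDF vector (dict) per sentence; each sentence is treated as a document.
    n = len(token_lists)
    df = Counter(term for tokens in token_lists for term in set(tokens))
    vectors = []
    for tokens in token_lists:
        tf = Counter(tokens)
        total = len(tokens) or 1
        # Smoothed IDF (assumed variant): ln((1 + n) / (1 + df)) + 1
        vectors.append({
            term: (count / total) * (math.log((1 + n) / (1 + df[term])) + 1.0)
            for term, count in tf.items()
        })
    return vectors


def cosine(u, v):
    # Cosine similarity of two sparse vectors; 0.0 if either vector is empty.
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def summarize(text, ratio=0.3):
    # Keep roughly `ratio` of the sentences (at least one), in original order.
    sentences = split_sentences(text)
    if not sentences:
        return []
    vectors = tfidf_vectors([tokenize(s) for s in sentences])
    scores = []
    for i, u in enumerate(vectors):
        others = [cosine(u, v) for j, v in enumerate(vectors) if j != i]
        scores.append(sum(others) / len(others) if others else 0.0)
    k = max(1, round(len(sentences) * ratio))
    ranked = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)
    return [sentences[i] for i in sorted(ranked[:k])]
```

Calling `summarize(document_text, ratio=0.3)` returns a list of the selected sentences. Because the number of sentences kept is a fraction of the total, a longer document yields a longer annotation, which mirrors the length rule stated in the abstract.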
