Evaluation of the Coherence of Polish Texts Using Neural Network Models

Sergii Telenyk, Sergiy Pogorilyy, Artem Kramov

doi:10.3390/app11073210

Open Access

https://doi.org/10.3390/app11073210

Abstract

Coherence evaluation of texts is a natural language processing task. Evaluating a text’s coherence means estimating its semantic and logical integrity; this property can be used in multidisciplinary tasks (SEO analysis, medicine, detection of fake texts, etc.). In this paper, different state-of-the-art coherence evaluation methods based on machine learning models are analyzed, and their effectiveness for the coherence estimation of Polish texts is investigated. The impact of a text’s features on the output coherence value is analyzed using different approaches to building a semantic similarity graph. Two neural networks, based on LSTM layers and on a pre-trained BERT model respectively, have been designed and trained for the coherence estimation of input texts. The results obtained suggest that both lexical and semantic components should be taken into account during the coherence evaluation of Polish documents; moreover, it is advisable to analyze such documents sentence by sentence, taking word order into account. Given the accuracy achieved by the proposed neural networks, the suggested models may be used to solve typical coherence estimation tasks for a Polish corpus.

Highlights

The natural language processing (NLP) area incorporates different tasks connected with the automatic analysis of text information by means of computational linguistics and machine learning: text generation, information extraction, speech analysis, etc.
With the preceding adjacent vertex (PAV) approach, both metrics increase as the regulating parameter α increases, peaking at α = 0.6.
Long short-term memory (LSTM) cells allow either sentences or entire texts to be represented as vectors that reflect the position of their items.
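
The role of the regulating parameter α in a PAV-style semantic similarity graph can be illustrated with a small sketch. It is not the authors' implementation; it rests on several assumptions that this excerpt does not confirm: each sentence is linked only to the sentence immediately before it, α weights a lexical component (here a Jaccard overlap of unique tokens) against a semantic component (here a bag-of-words cosine that stands in for real sentence embeddings), and the text score is the mean edge weight. The function names are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(sentence):
    """Lowercase a sentence and split it into word tokens (Unicode-aware)."""
    return re.findall(r"\w+", sentence.lower())


def lexical_overlap(a, b):
    """Jaccard index over the unique tokens of two sentences."""
    sa, sb = set(a), set(b)
    if not sa or not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)


def cosine(a, b):
    """Cosine similarity of token-count vectors (stand-in for embeddings)."""
    ca, cb = Counter(a), Counter(b)
    dot = sum(ca[t] * cb[t] for t in ca)
    norm_a = math.sqrt(sum(v * v for v in ca.values()))
    norm_b = math.sqrt(sum(v * v for v in cb.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def pav_coherence(sentences, alpha=0.6):
    """Mean weight of edges between each sentence and its predecessor.

    Assumption: weight = alpha * lexical + (1 - alpha) * semantic.
    """
    tokens = [tokenize(s) for s in sentences]
    if len(tokens) < 2:
        return 0.0
    weights = [
        alpha * lexical_overlap(prev, cur) + (1 - alpha) * cosine(prev, cur)
        for prev, cur in zip(tokens, tokens[1:])
    ]
    return sum(weights) / len(weights)


if __name__ == "__main__":
    text = ["Kot śpi na kanapie.", "Kot na kanapie mruczy.", "Pada deszcz."]
    for alpha in (0.0, 0.6, 1.0):
        print(alpha, round(pav_coherence(text, alpha), 3))
```

Under these assumptions, α = 0 uses only the semantic component and α = 1 only the lexical one, so sweeping α shows how the balance between the two affects the score.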

Summary

Introduction

The natural language processing (NLP) area incorporates different tasks connected with the automatic analysis of text information by means of computational linguistics and machine learning: text generation, information extraction, speech analysis, etc. The evaluation of textual coherence, namely, distinguishing coherent documents from incoherent ones [1], belongs to this kind of task. The coherence of a text can be considered as the set of procedures that provide its cognitive integrity. Such procedures involve logical connections between cause and effect, condition and result. Coherence also ensures the consistency of text data with background knowledge. A coherent document is easier to read and understand than an incoherent one.
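
The task of distinguishing coherent documents from incoherent ones can be sketched as a pairwise comparison. The excerpt does not say how incoherent documents are obtained, so the sketch below assumes a common protocol: each document is paired with a shuffled copy of itself, and a scoring function counts as correct when it ranks the original higher. The names `shuffled_copy` and `discrimination_accuracy` and the `score` callable are hypothetical; any coherence estimator that maps a list of sentences to a number could be passed in.

```python
import random


def shuffled_copy(sentences, rng):
    """Return a permutation of the sentences that differs from the original order."""
    original = list(sentences)
    if len(set(original)) < 2:
        raise ValueError("need at least two distinct sentences to shuffle")
    permuted = original[:]
    while permuted == original:
        rng.shuffle(permuted)
    return permuted


def discrimination_accuracy(documents, score, seed=0):
    """Share of documents whose original order outscores a shuffled copy.

    `score` is any callable mapping a list of sentences to a number
    (assumption: a higher value means a more coherent text).
    """
    if not documents:
        raise ValueError("no documents given")
    rng = random.Random(seed)
    wins = sum(
        1 for doc in documents
        if score(list(doc)) > score(shuffled_copy(doc, rng))
    )
    return wins / len(documents)


if __name__ == "__main__":
    # Toy scorer: counts adjacent sentence pairs that are in alphabetical order.
    def toy_score(sentences):
        return sum(a < b for a, b in zip(sentences, sentences[1:]))

    docs = [["a", "b", "c"], ["k", "l", "m", "n"]]
    print(discrimination_accuracy(docs, toy_score))
```

A fixed seed keeps the shuffles reproducible, which makes accuracy figures comparable across scoring functions.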

Journal: Applied Sciences
Publication Date: Apr 2, 2021
Citations: 3
License type: CC BY 4.0
