Abstract

This paper considers models for estimating both the local and the global coherence of Ukrainian-language texts. To evaluate the local coherence of a document, Transformer-based and LSTM-based neural networks are proposed and trained on a Ukrainian-language news corpus. The LSTM-based approach outperforms the corresponding Transformer-based network in terms of accuracy metrics on typical tasks for both test datasets. To investigate the connections between sentences captured by the neural network, the Uniform Manifold Approximation and Projection (UMAP) dimension reduction technique is used to project sentence embeddings into 2D space. The resulting clusters may indicate that the designed model takes into account both the structure of a sentence and the different types of connections between sentences. To estimate the global coherence of a document, a model based on a graph convolutional neural network is suggested. Taking into account the connections between all sentences, regardless of their positions, is shown to be appropriate. The results obtained for the trained global coherence estimation model may indicate that the designed models analyze different aspects of a text, which supports using both local and global coherence estimation models, depending on the assigned task.

Keywords: Local and global coherence of a document; Transformer-based neural network; Sentence embedding; Graph convolutional network; Ukrainian corpora
