Abstract

This paper addresses a major issue in computational linguistics: the automatic calculation of text coherence. To date, only a few methods have been proposed for automatically detecting the local coherence of texts, and all of them require extensive pre-processing and considerable computational effort. Here we suggest a simple method for evaluating coherence globally. First, we use a word ranking method to assign an importance value to each word type in a text, and then construct the importance time series associated with the text. Next, Detrended Fluctuation Analysis (DFA), which is used for detecting inherent correlations in time series, is applied to the text's importance time series. We find that the importance time series exhibits bi-scale behavior: it is long-range correlated at large distances, while short-range correlations are observed at small distances. We also observe that the scaling exponent decreases for a shuffled text. The decrease grows progressively larger as we shuffle the chapters, paragraphs, sentences, and words, in that order. This leads us to consider the scaling exponent of the text time series (STT for short) as a measure for quantifying global coherence. We support this claim with an experiment on three sample texts and a comparison of our method with several entity-grid-based models.
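The pipeline described above (importance time series, then DFA, then a scaling exponent) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not name the word ranking method, so relative word frequency is used here as a stand-in importance score, and the function names `importance_series`, `dfa_fluctuations`, and `scaling_exponent` are hypothetical. The detrending is first-order (linear), which is one common DFA variant.

```python
from collections import Counter

import numpy as np


def importance_series(tokens):
    """Map each token to an importance score of its word type.

    Assumption: relative frequency stands in for the paper's word
    ranking method, which the abstract does not specify.
    """
    counts = Counter(tokens)
    total = len(tokens)
    return np.array([counts[t] / total for t in tokens], dtype=float)


def dfa_fluctuations(series, scales):
    """Return the DFA fluctuation F(n) for each window size n in scales."""
    x = np.asarray(series, dtype=float)
    profile = np.cumsum(x - x.mean())  # integrated, mean-centred series
    fluctuations = []
    for n in scales:
        n_windows = len(profile) // n
        # Non-overlapping windows; leftover points at the end are dropped.
        windows = profile[: n_windows * n].reshape(n_windows, n).T
        t = np.arange(n)
        # Fit a straight line to every window at once (first-order DFA).
        slopes, intercepts = np.polyfit(t, windows, 1)
        trend = np.outer(t, slopes) + intercepts
        residuals = windows - trend
        fluctuations.append(np.sqrt(np.mean(residuals ** 2)))
    return np.array(fluctuations)


def scaling_exponent(series, scales):
    """Slope of log F(n) against log n, i.e. the DFA scaling exponent."""
    f = dfa_fluctuations(series, scales)
    slope, _intercept = np.polyfit(np.log(scales), np.log(f), 1)
    return slope
```

To mirror the shuffling comparison, one could compute `scaling_exponent(importance_series(tokens), scales)` for the original token list and again for a shuffled copy of it. Fitting the slope separately over small and large window sizes would be one way to probe the bi-scale behavior; the choice of `scales` is left to the reader and is not taken from the paper.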
