Abstract

Text transcription is crucial in Chinese information processing. Texts have been transcribed since ancient times, but whether transcription is done by hand, as in the past, or with modern communication and storage devices, random errors are unavoidable once a message has been forwarded and transcribed many times. This paper addresses three problems that arise from the characteristics of text transcription: how to measure the size of the difference between versions of a text, how to estimate the number of transmissions separating two texts, and how to design a fast and effective algorithm for computing the first two. We propose the concept of text similarity and construct a TF-IDF similarity evaluation model for text, a text transmission evaluation model based on a Gaussian process (the GFCT Model), and a model based on the immune frog jumping algorithm for the comparative processing of text. The aim is to provide a new method for text data processing and to improve its accuracy and effectiveness.
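
To make the idea of a TF-IDF similarity score between two versions of a text concrete, the following is a minimal sketch and not the paper's model. It assumes character-level tokens (a simplification that avoids Chinese word segmentation), a smoothed IDF weighting, and cosine similarity as the comparison; the function names `char_tokens`, `tfidf_vectors`, and `cosine_similarity` are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def char_tokens(text):
    """Split text into single-character tokens, ignoring whitespace (assumption)."""
    return [ch for ch in text if not ch.isspace()]


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (a dict token -> weight) per document."""
    tokenized = [char_tokens(d) for d in docs]
    n_docs = len(tokenized)

    # Document frequency: in how many documents each token appears.
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))

    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens) or 1  # guard against empty documents
        vec = {}
        for tok, count in counts.items():
            # Smoothed IDF (an assumed convention) so that tokens shared by
            # every document still carry a nonzero weight.
            idf = math.log((1 + n_docs) / (1 + df[tok])) + 1.0
            vec[tok] = (count / total) * idf
        vectors.append(vec)
    return vectors


def cosine_similarity(u, v):
    """Cosine of the angle between two sparse vectors; 0.0 if either is empty."""
    dot = sum(weight * v.get(tok, 0.0) for tok, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


if __name__ == "__main__":
    # An original line, a copy with one miswritten character, and an unrelated line.
    versions = ["学而时习之", "学而时刁之", "天地玄黄"]
    vecs = tfidf_vectors(versions)
    print(cosine_similarity(vecs[0], vecs[1]))  # copy with one changed character
    print(cosine_similarity(vecs[0], vecs[2]))  # text with no shared characters
```

In this sketch a copy that differs by a single character scores strictly between 0 and 1, while texts sharing no characters score 0. A score of this kind only measures the size of the difference; it does not by itself estimate how many transmissions separate two versions.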
