Chinese Automatic Text Simplification Based on Unsupervised Learning

Yang Sen,Yang Fuping

doi:10.1109/iaeac50856.2021.9390937

Abstract

In this paper, a Chinese automatic text simplification(ATS) method based on unsupervised learning was introduced. Automatic text simplification is a research field of natural language processing. In terms of Chinese texts, the reliance on the hand-made simplified corpus or dictionary is not applicable due to a large number of texts. Chinese is a diverse language, and numerous factors need to be taken into consideration. An automatic simplification method based on Chinese text and a readability formula based on linear regression was proposed in this paper. Based on our method, just input a set of Chinese sentences and the more comprehensible sentences can be obtained through syntactic simplification and lexical simplification. Through the automatic evaluation of the hand-made simplified corpus, the readability score of our system increased by 3.68 compared with that of the original text, and the SARI score reached 36.02.

Full Text