Research on Proofreading Method of Semantic Collocation Error in Chinese

Rui Zhang, Ruoyu Chen, Gaijuan Huang, Yangsen Zhang

doi:10.1007/978-3-030-78615-1_62

Abstract

With the rapid development of network technology and the spread of electronic documents, automatic proofreading of Chinese text has attracted increasing attention. Automatic proofreading of semantic errors is a key and difficult problem in Chinese information processing. To address it, we propose a semantic error proofreading method that combines dependency parsing with statistical theory, and we construct a two-layer semantic knowledge base to assist error detection and correction. The knowledge base consists of (1) a knowledge base of word collocations, containing structured sentence information extracted from a large-scale corpus; and (2) a knowledge base of sememe collocations, obtained by mapping words to sememes through HowNet. On this basis, the cubic association ratio and the degree of polymerization are introduced to evaluate the proofreading results, which reduces false positives and improves the accuracy of the correction suggestions. The experimental results show that our method is useful for constructing semantic proofreading knowledge bases and for developing automatic semantic error proofreading methods.
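As a rough illustration of how a collocation score of this kind can be computed, the sketch below scores (head, dependent) word pairs with a cubic association ratio. The abstract does not give the exact formula, so the code assumes the commonly used form MI3 = log2(p(x,y)^3 / (p(x) p(y))) with probabilities estimated from pair counts. The toy pairs and the names `cubic_association_ratio` and `score` are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def cubic_association_ratio(pair_count, left_count, right_count, total_pairs):
    """Assumed form: log2(p(x,y)^3 / (p(x) * p(y))), estimated from counts."""
    if pair_count == 0:
        # An unseen collocation gets the lowest possible score.
        return float("-inf")
    p_xy = pair_count / total_pairs
    p_x = left_count / total_pairs
    p_y = right_count / total_pairs
    return math.log2(p_xy ** 3 / (p_x * p_y))


# Hypothetical (head, dependent) pairs, standing in for collocations
# that a dependency parser would extract from a corpus.
pairs = [
    ("eat", "rice"), ("eat", "rice"), ("eat", "noodles"),
    ("drink", "tea"), ("drink", "tea"), ("drink", "water"),
]
pair_counts = Counter(pairs)
head_counts = Counter(head for head, _ in pairs)
dep_counts = Counter(dep for _, dep in pairs)
total = len(pairs)


def score(head, dep):
    """Score one candidate collocation against the toy counts above."""
    return cubic_association_ratio(
        pair_counts[(head, dep)], head_counts[head], dep_counts[dep], total
    )
```

Under these assumptions, a frequent pair such as ("eat", "rice") scores higher than a rarer one such as ("eat", "noodles"), and a pair never seen in the data, such as ("eat", "tea"), receives negative infinity, which a proofreader could treat as a candidate error.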

Full Text