Completion and parsing Chinese sentences using cogent confabulation

Zhe Li, Qinru Qiu

doi:10.1109/ccmb.2014.7020691

Abstract

Sentence completion and parsing are especially difficult for Chinese. Chinese words are not naturally separated by delimiters, which poses an extra challenge. Sentence completion based on cogent confabulation has been proposed for English: it fills in missing words in an English sentence while maintaining semantic and syntactic consistency. In this work, we improve the cogent confabulation model and apply it to sentence completion in Chinese. By incorporating trained knowledge of part-of-speech tagging and Chinese word compound segmentation, the model not only fills in missing words in a sentence but also performs linguistic analysis of the sentence with high accuracy. We further investigate optimization of the model and the trade-offs between accuracy and training/recall complexity. Experimental results show that the optimized model improves recall accuracy by 9% and reduces training and recall time by 18.6% and 53.7%, respectively.

Full Text