Advances in Chinese Natural Language Processing and Language Resources

Jianhua Tao, Fang Zheng, Aijun Li, Ya Li

doi:10.1109/icsda.2009.5278384

Abstract

In the past few years, there has been a significant amount of activity in the area of Chinese Natural Language Processing (CNLP), including language resource construction and assessment. This paper summarizes the major tasks and key technologies in Natural Language Processing (NLP), which by extension encompasses both text processing and speech processing. It also introduces Chinese language resources, including linguistic data, speech data, evaluation data, and language toolkits, which have been elaborately constructed for CNLP-related fields, as well as some language resource consortiums. Aiming to promote the development of corpus-based technologies, many resource consortiums commit themselves to collecting, creating, and distributing many kinds of resources. The goal of these organizations is to set up a universal and well-accepted database of Chinese resources in order to push CNLP forward.

Full Text