Current Status and Applications of Modern Chinese Morphological Analyzers

Minjun Park

doi:10.38068/kjcl.102.13

Abstract

This paper surveys commonly used Chinese morphological analyzers (e.g., ICTCLAS, Jieba, Stanford CoreNLP) that are employed in two Chinese preprocessing tasks, Chinese Word Segmentation (CWS) and part-of-speech tagging, and provides a detailed tutorial for each. To make the tools more usable, simple executables were developed and distributed to linguistic researchers unfamiliar with coding, along with extensive execution examples in GUI and CLI environments. In addition, by introducing the distinctive features and functions of each analyzer, the paper recommends the analyzer best suited to the needs of individual researchers. As a guide to Chinese morphological analysis, an unavoidable step in data-driven quantitative research, this study offers practical tools and guidelines for Chinese text preprocessing to researchers who want to extend their work into corpus linguistics, computational linguistics, and natural language processing.

Full Text