Syllable-based Probabilistic Models for Korean Morphological Analysis

Kwangseob Shim

doi:10.5626/jok.2014.41.9.642

Abstract

This paper proposes three probabilistic models for syllable-based Korean morphological analysis and reports their performance. The model probabilities are estimated from a POS-tagged corpus. In 10-fold cross-validation experiments, the models achieve a 98.3% answer inclusion rate when trained on the Sejong POS-tagged corpus of 10 million eojeols. In these models, POS tags are assigned to each syllable before spelling recovery and morpheme generation, which makes morphological analysis more efficient than in previous probabilistic models, where spelling recovery is performed in the first stage. This efficiency translates into faster analysis: in the experiments, morphological analysis runs at 147K eojeols per second, almost 174 times faster than the previous probabilistic models for Korean morphology.
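The abstract does not spell out the three models, so the following is only a minimal sketch of the general idea of tagging syllables first and generating morphemes afterwards. It assumes a hypothetical `B-`/`I-` prefix scheme for per-syllable tags and a hypothetical helper named `group_syllables`; neither is taken from the paper, and the sketch skips spelling recovery entirely.

```python
from typing import List, Tuple


def group_syllables(syllables: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Merge per-syllable tags into (morpheme, POS) pairs.

    Assumption: each tag looks like 'B-POS' (first syllable of a morpheme)
    or 'I-POS' (continuation). This scheme is illustrative only.
    """
    if len(syllables) != len(tags):
        raise ValueError("syllables and tags must have the same length")

    morphemes: List[List[str]] = []
    for syllable, tag in zip(syllables, tags):
        position, pos = tag.split("-", 1)
        # Start a new morpheme on 'B', or when the POS changes unexpectedly.
        if position == "B" or not morphemes or morphemes[-1][1] != pos:
            morphemes.append([syllable, pos])
        else:
            morphemes[-1][0] += syllable
    return [(surface, pos) for surface, pos in morphemes]


# Toy eojeol: 학교에서 ("at school"), tagged by hand with Sejong-style POS labels.
example = group_syllables(list("학교에서"), ["B-NNG", "I-NNG", "B-JKB", "I-JKB"])
```

In this toy case no spelling change is involved, so the grouped syllables are already the morphemes; an eojeol with contracted or altered spelling would still need the recovery step that the abstract places after tagging.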


Journal: Journal of KIISE	Publication Date: Sep 15, 2014
Citations: 3

Similar Papers

The Development of Indonesian POS Tagging System for Computer-aided Independent Language Learning
Muljono Muljono ... Catur Supriyanto
International Journal of Emerging Technologies in Learning (iJET) | VOL. 12
16 Nov 2017

A Study on the Importance of Linguistic Suffixes in Maithili POS Tagger Development
Ankur Priyadarshi ... Sujan Kumar Saha
01 Jan 2020

Korean Part-of-speech Tagging Based on Morpheme Generation
Hyun-Je Song ... Seong-Bae Park
ACM Transactions on Asian and Low-Resource Language Information Processing | VOL. 19
09 Jan 2020

Part of Speech Tagger for Marathi Language
Sharvari Govilkar ... Bakal J. W
International Journal of Computer Applications | VOL. 119
18 Jun 2015
