Morpheme Recovery Using a Naïve Bayes (NB) Model

Jae-Hoon Kim, Kil-Ho Jeon

doi:10.3745/kipstb.2012.19b.3.195

Abstract

Korean is an agglutinative language, so part-of-speech (POS) tagging is difficult without morphological analysis, and during that analysis the various spelling changes in surface forms must be recovered into their base forms. This is one of the chronic problems in Korean morphological analysis, and it has mostly been handled with morpheme recovery rules. Rules alone have difficulty selecting the recovery that best fits a given context, so they generate several kinds of morphological ambiguity, which is then resolved by POS tagging. In this paper, we address the problem with a machine learning method, a Naïve Bayes model. The input features of the model are the syllables surrounding the position where a spelling change occurs, and the output categories are the recovered syllables. In experiments on the ETRI tree-tagged corpus, morpheme-level POS tagging with the proposed morpheme recovery model achieved an $F_1$-score of 97.5%, which indicates that the model is very useful for morpheme recovery in Korean.
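The setup described in the abstract (surrounding syllables as input features, recovered syllables as output categories) can be illustrated with a small sketch. This is not the authors' implementation: the class name `NaiveBayesRecoverer`, the three-syllable context window, the boundary marker `<s>`, the add-one smoothing, and the toy training pairs are all assumptions made for illustration, and none of the data comes from the ETRI corpus.

```python
import math
from collections import Counter, defaultdict


class NaiveBayesRecoverer:
    """Toy Naive Bayes classifier: context syllables -> recovered form.

    Hypothetical sketch; the feature window and smoothing are assumptions.
    """

    def __init__(self, alpha=1.0):
        self.alpha = alpha                          # additive smoothing constant
        self.class_counts = Counter()               # label -> count
        self.feature_counts = defaultdict(Counter)  # (position, label) -> syllable counts
        self.vocab = defaultdict(set)               # position -> syllables seen there

    def fit(self, examples):
        for context, label in examples:
            self.class_counts[label] += 1
            for pos, syllable in enumerate(context):
                self.feature_counts[(pos, label)][syllable] += 1
                self.vocab[pos].add(syllable)

    def predict(self, context):
        total = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, count in self.class_counts.items():
            # log P(label) + sum over positions of log P(syllable | label)
            score = math.log(count / total)
            for pos, syllable in enumerate(context):
                num = self.feature_counts[(pos, label)][syllable] + self.alpha
                # +1 reserves probability mass for unseen syllables
                den = count + self.alpha * (len(self.vocab[pos]) + 1)
                score += math.log(num / den)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Illustrative pairs: (previous, target, next syllable) -> recovered form.
# "<s>" marks the start of the word. These pairs are made up for the sketch.
TOY_EXAMPLES = [
    (("<s>", "갔", "다"), "가+았"),
    (("<s>", "봤", "다"), "보+았"),
    (("<s>", "했", "다"), "하+았"),
    (("<s>", "가", "서"), "가+아"),
    (("나", "가", "서"), "가+아"),
    (("<s>", "가", "고"), "가"),
    (("<s>", "가", "면"), "가"),
]

model = NaiveBayesRecoverer()
model.fit(TOY_EXAMPLES)

for ctx in [("<s>", "가", "서"), ("<s>", "가", "고"), ("<s>", "갔", "다")]:
    print(ctx, "->", model.predict(ctx))
```

In this toy data the same surface syllable 가 maps to different recovered forms depending on the following syllable, which is the kind of context-dependent choice that the abstract says rules alone handle poorly.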

Journal: The KIPS Transactions:PartB
Publication Date: Jun 30, 2012
Citations: 6
