A Local Generative Model for Chinese Word Segmentation

Kaixu Zhang, Maosong Sun, Ping Xue

doi:10.1007/978-3-642-17187-1_41

Abstract

This paper presents a local generative model for Chinese word segmentation. The model has a faster learning process than discriminative models, supports unsupervised learning, and can make use of larger resources. In this model, four successive characters are used to determine whether a character interval should be a word boundary. For unsupervised learning, the Gibbs sampling algorithm is applied together with three additional rules. Besides words, the word candidates generated by our model can improve the performance of Chinese information retrieval. The experiments show that in supervised learning our method outperforms a language-model-based method, and its performance on one corpus is better than the best result reported in SIGHAN Bakeoff 05. In unsupervised learning, our method achieves performance comparable to that of the state-of-the-art method.

Keywords: probability model, natural language processing, Chinese word segmentation
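The idea of deciding each character interval from four successive characters can be illustrated with a minimal sketch. This is not the authors' model: the abstract does not give its exact form, so the sketch assumes a simple count-based estimate of the boundary probability for each four-character window, with add-one smoothing and a 0.5 decision threshold. The names `windows`, `train`, `segment` and the padding symbol `PAD` are hypothetical, and only the supervised case is shown (no Gibbs sampling, no additional rules).

```python
from collections import Counter

PAD = "#"  # hypothetical padding symbol for sentence edges (an assumption)


def windows(chars):
    """Yield (i, window) for each interval between chars[i] and chars[i + 1].

    The window holds four successive characters: one before the interval's
    left character, the two characters around the interval, and one after.
    """
    padded = PAD + chars + PAD
    for i in range(len(chars) - 1):
        yield i, padded[i:i + 4]


def train(segmented_sentences):
    """Count how often each window is seen, and how often it is a boundary."""
    boundary, total = Counter(), Counter()
    for words in segmented_sentences:
        chars = "".join(words)
        cuts, pos = set(), 0
        for word in words[:-1]:
            pos += len(word)
            cuts.add(pos - 1)  # boundary right after character index pos - 1
        for i, win in windows(chars):
            total[win] += 1
            if i in cuts:
                boundary[win] += 1
    return boundary, total


def segment(chars, boundary, total, threshold=0.5):
    """Cut at every interval whose smoothed boundary estimate exceeds threshold."""
    if not chars:
        return []
    words, start = [], 0
    for i, win in windows(chars):
        # Add-one smoothing (an assumption): unseen windows get exactly 0.5.
        p = (boundary[win] + 1) / (total[win] + 2)
        if p > threshold:
            words.append(chars[start:i + 1])
            start = i + 1
    words.append(chars[start:])
    return words


if __name__ == "__main__":
    counts = train([["我们", "喜欢", "音乐"]])
    print(segment("我们喜欢音乐", *counts))
```

Because each decision depends only on a local window, training in this sketch reduces to counting, which is one plausible reading of why a local generative model can learn faster than a discriminative one.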

