Abstract

Distributed word representations have demonstrated their advantages in many natural language processing tasks, such as named entity recognition, entity relation extraction, and text classification. Traditional one-hot representation encodes a word as a high-dimensional, sparse vector. In contrast, a distributed representation encodes a word as a low-dimensional, dense vector, which is more suitable as input to deep neural networks. Furthermore, distributed representations can capture semantic relatedness and syntactic regularities between words. Word embedding is a distributed word representation technique that is popular and useful in many natural language processing tasks. Recently, a growing body of research has focused on learning word embeddings from internal morphological knowledge, such as characters, subwords, and other kinds of morphological information. For example, Chinese characters carry rich semantic information related to the words they compose, so characters can help improve the representation of those words. In this paper, we present a character-enhanced Chinese word embedding model (CCWE), in which character and word embeddings are trained simultaneously in two parallel tasks within a framework based on Skip-Gram. We evaluate CCWE on word similarity, analogical reasoning, text classification, and named entity recognition. The results demonstrate that our model learns better Chinese word and character embeddings than the baseline models.
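The abstract names only the framework, so the following is a minimal sketch of what two parallel Skip-Gram tasks could look like, assuming standard Skip-Gram with negative sampling for both the word-level and character-level tasks and a loss that is simply summed. The vocabulary sizes, dimensionality, learning rate, and the `SkipGramNS` and `train_step` names are illustrative assumptions, not the paper's actual implementation.

```python
# A minimal sketch of joint word/character Skip-Gram training, assuming the
# two parallel tasks share an optimizer and their negative-sampling losses
# are summed. All hyperparameters below are illustrative placeholders.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SkipGramNS(nn.Module):
    """Skip-Gram with negative sampling over one vocabulary (words or characters)."""
    def __init__(self, vocab_size, dim):
        super().__init__()
        self.in_emb = nn.Embedding(vocab_size, dim)   # target embeddings
        self.out_emb = nn.Embedding(vocab_size, dim)  # context embeddings

    def forward(self, target, context, negatives):
        v = self.in_emb(target)                        # (B, d) target vectors
        u_pos = self.out_emb(context)                  # (B, d) true context
        u_neg = self.out_emb(negatives)                # (B, k, d) sampled noise
        pos = F.logsigmoid((v * u_pos).sum(-1))        # log sigma(u_o . v_t)
        neg = F.logsigmoid(-(u_neg @ v.unsqueeze(-1)).squeeze(-1)).sum(-1)
        return -(pos + neg).mean()                     # negative log-likelihood

word_task = SkipGramNS(vocab_size=50_000, dim=100)     # word-level task
char_task = SkipGramNS(vocab_size=8_000, dim=100)      # character-level task
opt = torch.optim.SGD(
    list(word_task.parameters()) + list(char_task.parameters()), lr=0.025
)

def train_step(word_batch, char_batch):
    # Each batch is a (target, context, negatives) index triple; the two
    # parallel tasks are optimized jointly by summing their losses.
    loss = word_task(*word_batch) + char_task(*char_batch)
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()
```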
