Chinese Named Entity Recognition Based on Character-Word Vector Fusion

Na Ye, Xin Qin, Lili Dong, Xiang Zhang, Kangkang Sun

doi:10.1155/2020/8866540

Abstract

Because Chinese text lacks explicit markers to define word boundaries, identifying named entities is often more difficult in Chinese than in English. At present, Chinese named entity recognition models are trained on pretrained character vectors or word vectors. Each choice has a drawback: taking character vectors as the input of the neural network cannot use the words’ semantic meanings and gives up the words’ explicit boundary information, while taking word vectors as the input relies on the accuracy of the segmentation algorithm. To address these problems, this paper proposes CWVF-BiLSTM-CRF (Character Word Vector Fusion-Bidirectional Long Short-Term Memory Network-Conditional Random Field), a Chinese named entity recognition model based on character-word vector fusion. First, Word2Vec is used to obtain dictionaries that map characters to character vectors and words to word vectors. Second, the fused character-word vector is used as the input unit of the BiLSTM (bidirectional long short-term memory) network, and the CRF (conditional random field) then resolves the problem of unreasonable tag sequences. The presented model reduces the dependence on the accuracy of the word segmentation algorithm and makes effective use of the words’ semantic characteristics. The experimental results show that the model based on character-word vector fusion improves the recognition of Chinese named entities.
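The role the abstract assigns to the CRF layer, ruling out unreasonable tag sequences, can be illustrated with a small decoding sketch. This is not the paper's implementation: the tag set, the emission scores, and the hard-coded transition rule are all assumptions made for illustration, whereas a trained CRF would learn its transition scores from data.

```python
from typing import Dict, List

TAGS = ["O", "B-PER", "I-PER"]   # assumed toy tag set (BIO scheme, one entity type)
NEG = -1e9                       # score that effectively forbids a transition


def transition_score(prev: str, cur: str) -> float:
    # Hypothetical hard constraint: I-PER may only follow B-PER or I-PER.
    if cur == "I-PER" and prev == "O":
        return NEG
    return 0.0


def viterbi(emissions: List[Dict[str, float]]) -> List[str]:
    """Return the highest-scoring tag path under the transition constraint."""
    # A sentence may not start with I-PER.
    best = {
        t: (emissions[0][t] + (NEG if t == "I-PER" else 0.0), [t]) for t in TAGS
    }
    for scores in emissions[1:]:
        nxt = {}
        for cur in TAGS:
            # Pick the best previous tag for reaching `cur`.
            prev = max(TAGS, key=lambda p: best[p][0] + transition_score(p, cur))
            total = best[prev][0] + transition_score(prev, cur) + scores[cur]
            nxt[cur] = (total, best[prev][1] + [cur])
        best = nxt
    return max(best.values(), key=lambda item: item[0])[1]


# Made-up per-character scores, standing in for BiLSTM outputs.
emissions = [
    {"O": 1.0, "B-PER": 0.8, "I-PER": 0.1},
    {"O": 0.2, "B-PER": 0.3, "I-PER": 1.5},
    {"O": 1.0, "B-PER": 0.1, "I-PER": 0.2},
]

greedy = [max(e, key=e.get) for e in emissions]  # picks each tag independently
decoded = viterbi(emissions)                     # scores the whole sequence
```

With these toy scores, choosing each tag independently yields O, I-PER, O, where an inside tag follows an outside tag. Scoring the whole sequence yields B-PER, I-PER, O instead.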

Highlights

In a broad sense, the purpose of named entity recognition (NER) is to recognize the named entities in a text and classify them into the corresponding entity types.
Character-word vector fusion is key to Chinese named entity recognition, and we propose to process the input by fusing each character vector with the vector of the word in which the character is contained.
To search for the optimal structure of the named entity recognition model, a tuning experiment is performed on common parameters that affect the performance of the model.
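The character-word fusion described in the highlights can be sketched as follows. The page does not state which fusion operator the paper uses, so concatenation is an assumption here, as are the function name `fuse_char_word`, the zero-vector fallback for unknown entries, and the toy vectors that stand in for Word2Vec lookups. The sentence is assumed to be already segmented into words.

```python
from typing import Dict, List

Vector = List[float]


def fuse_char_word(
    words: List[str],
    char_vecs: Dict[str, Vector],
    word_vecs: Dict[str, Vector],
    char_dim: int,
    word_dim: int,
) -> List[Vector]:
    """Build one fused vector per character of a segmented sentence.

    Each character vector is concatenated with the vector of the word
    that contains the character (assumed fusion operator).
    """
    fused: List[Vector] = []
    for word in words:
        # Unknown words fall back to a zero vector (assumption).
        w_vec = word_vecs.get(word, [0.0] * word_dim)
        for ch in word:
            # Unknown characters fall back to a zero vector (assumption).
            c_vec = char_vecs.get(ch, [0.0] * char_dim)
            fused.append(c_vec + w_vec)  # list concatenation
    return fused


# Toy lookup tables; real ones would come from trained embeddings.
words = ["北京", "大学"]
char_vecs = {"北": [0.1, 0.2], "京": [0.3, 0.4], "大": [0.5, 0.6]}
word_vecs = {"北京": [1.0, 1.1, 1.2], "大学": [2.0, 2.1, 2.2]}

fused = fuse_char_word(words, char_vecs, word_vecs, char_dim=2, word_dim=3)
```

Every character in the same word shares that word's vector, so the per-character sequence carries word-level semantics while keeping one input unit per character.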

Summary

Introduction

The purpose of named entity recognition (NER) is to recognize the named entities in a text and classify them into the corresponding entity types. Lample et al. [7] used a BiLSTM to extract character-level features, which were fused with word vectors from dictionaries to form the final input vector, and combined the BiLSTM with a CRF model for named entity recognition, achieving good results on English, German, Spanish, and other test corpora. The methods proposed by Ma and Hovy and by Lample et al. both leveraged word vectors for named entity recognition in non-Chinese corpora, where the accuracy of word segmentation did not need to be considered; in Chinese corpora, however, the accuracy of word segmentation cannot be ignored.

CWVF-BiLSTM-CRF Model

Experimental Analysis

Experiment 1

Experiment 2

Experimental Results and Analysis

Conclusions

Journal: Wireless Communications and Mobile Computing
Publication Date: Jul 4, 2020
Citations: 13
License type: CC BY 4.0

Similar Papers

Gujarati Task Oriented Dialogue Slot Tagging Using Deep Neural Network Models
Rachana Parikh ... Hiren Joshi
01 Jan 2020

An ensemble method to forecast 24-h ahead solar irradiance using wavelet decomposition and BiLSTM deep learning network.
Pardeep Singla ... Manoj Duhan
Earth Science Informatics | VOL. 15
17 Nov 2021

Long short-term memory (LSTM)-based news classification model.
Chen Liu
PloS one | VOL. 19
01 Jan 2024

Automatic gear shift strategy for manual transmission of mine truck based on Bi-LSTM network
Liyong Wang ... Min Xie
Expert Systems With Applications | VOL. 209
03 Aug 2022
