Effective method for making Chinese word vector dynamic

Boting Liu, Zhijie Fang, Weili Guan, Changjin Yang

doi:10.3233/jifs-224052

Abstract

Word vectors are an important tool for natural language processing (NLP) tasks such as text classification. However, existing static models such as Word2vec assign each word a single vector regardless of context, so they cannot resolve polysemy, which degrades text classification performance. To address this, this paper proposes a method for making Chinese word vectors dynamic (MCWVD). Part-of-speech (POS) information is used to resolve the ambiguity that arises when a word can take different POS. A POS structure graph is constructed, and a GCN (Graph Convolutional Network) extracts syntactic structure information from the POS features. The POS vector and the word vector are concatenated into a PW (POS-Word) vector, and a parameter matrix is added to improve the fusion of POS and word features. Multilayer attention is then used to weight the importance of different features and to update each word vector according to its current context. Experiments on the Chinese datasets THUCNews and SogouNews show that MCWVD improves text classification accuracy and outperforms CoVe (Context Vectors) and ELMo (Embeddings from Language Models). MCWVD also achieves performance similar to BERT and GPT-1 (Generative Pre-Training), but at a much lower computational cost and with only 4% of BERT's parameters.
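
The pipeline described in the abstract (GCN over a POS graph, concatenation into a PW vector, fusion through a parameter matrix, then multilayer attention) can be illustrated with a minimal NumPy sketch. The abstract does not give the paper's equations, so everything here is an assumption made for illustration: the standard symmetric-normalized GCN propagation rule stands in for the paper's GCN, plain scaled dot-product self-attention without learned projections stands in for its attention layers, the chain-shaped POS graph, all dimensions, and all weights are hypothetical, and the weights are random rather than trained.

```python
import numpy as np


def normalize_adjacency(adj):
    """Return D^{-1/2} (A + I) D^{-1/2}, the normalization used by standard GCNs."""
    a_hat = adj + np.eye(adj.shape[0])          # add self-loops
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    return a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


def gcn_layer(a_norm, features, weight):
    """One graph-convolution layer with ReLU: relu(A_norm @ H @ W)."""
    return np.maximum(a_norm @ features @ weight, 0.0)


def softmax(x, axis=-1):
    """Numerically stable softmax."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)


def self_attention(x):
    """Scaled dot-product self-attention with no learned projections (a simplification)."""
    scores = x @ x.T / np.sqrt(x.shape[1])
    weights = softmax(scores, axis=-1)
    return weights @ x, weights


rng = np.random.default_rng(0)

# Hypothetical sizes: 4 tokens, 8-dim POS features, 16-dim static word vectors.
n_tokens, pos_dim, word_dim, out_dim = 4, 8, 16, 12

# Hypothetical POS structure graph: a chain linking adjacent tokens.
adj = np.zeros((n_tokens, n_tokens))
for i in range(n_tokens - 1):
    adj[i, i + 1] = adj[i + 1, i] = 1.0

pos_features = rng.normal(size=(n_tokens, pos_dim))    # stand-in POS embeddings
word_vectors = rng.normal(size=(n_tokens, word_dim))   # stand-in static word vectors
w_gcn = rng.normal(size=(pos_dim, pos_dim)) * 0.1      # untrained GCN weight
w_fuse = rng.normal(size=(pos_dim + word_dim, out_dim)) * 0.1  # untrained fusion matrix

# 1. Extract structure-aware POS vectors with one GCN layer.
a_norm = normalize_adjacency(adj)
pos_vectors = gcn_layer(a_norm, pos_features, w_gcn)

# 2. Concatenate POS and word vectors into PW vectors.
pw_vectors = np.concatenate([pos_vectors, word_vectors], axis=1)

# 3. Fuse the two feature types through a parameter matrix.
fused = pw_vectors @ w_fuse

# 4. Apply two stacked attention layers so each vector depends on its context.
dynamic_vectors = fused
for _ in range(2):
    dynamic_vectors, attn = self_attention(dynamic_vectors)

print(pw_vectors.shape, dynamic_vectors.shape)
```

In this sketch each row of `dynamic_vectors` is a context-weighted mixture of all token vectors in the sentence, so the same word would receive a different vector in a different sentence, which is the sense in which the word vector becomes dynamic.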

Journal: Journal of Intelligent & Fuzzy Systems
Publication Date: Jul 2, 2023
Citations: 1

