Abstract

Word embeddings (i.e., word representations) transform words into computable mathematical objects, usually vectors, based on their semantics. Compared with human semantic representation, these purely text-based models are severely deficient because they lack the perceptual information grounded in the physical world. This observation motivates the development of multimodal word representation models. Multimodal models have been shown to outperform text-based models at learning semantic word representations, yet almost all previous multimodal models focus solely on introducing perceptual information. However, syntactic information can also effectively improve the performance of multimodal models on downstream tasks. Therefore, this article proposes an effective multimodal word representation model that uses two gate mechanisms to explicitly embed syntactic and phonetic information into multimodal representations, and trains the model with supervised learning. We take Chinese and English as example languages and evaluate the model on several downstream tasks. The results show that our approach outperforms existing models. We have made the source code of the model available to encourage reproducible research.

Highlights

  • Word embedding is often used in natural language processing (NLP) tasks such as machine translation [59], text classification [1], and dialogue systems [50]

  • Compared to human semantic representation, these purely text-based models are severely deficient because they lack perceptual information attached to the physical world

  • Combining the results of the other intrinsic evaluation tasks, it can be concluded that the word representations generated by the MSP model contain more semantic and syntactic information, and that such information can be used in relevant downstream tasks.


Summary

INTRODUCTION

Word embedding is often used in natural language processing (NLP) tasks such as machine translation [59], text classification [1], and dialogue systems [50]. However, syntactic information is difficult to obtain through the distributional hypothesis alone, and purely text-based embeddings lack perceptual grounding. These factors inspire us to build a multimodal word representation model, called MSP, that can embed syntactic and perceptual information effectively. Compared with existing word embedding models, MSP explicitly embeds syntactic and phonetic information, simulates multimodal information fusion through two gate mechanisms, and is trained with supervision to obtain high-performing multimodal word representations. On various NLP tasks, we compare performance against multiple word representation models and pre-trained language models as baselines, and use MSP- (a variant with no processing of syntactic information) as a control.
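The paper's actual MSP architecture is not reproduced on this page, but a minimal sketch of the kind of gated fusion it describes may help: a learned sigmoid gate decides, per dimension, how much of an auxiliary modality (syntactic or phonetic features) to mix into the textual word vector. The class name `GatedFusion`, the dimensions, and the stacking order below are illustrative assumptions, not the authors' implementation.

```python
# Hypothetical sketch of gated multimodal fusion (not the paper's MSP code).
import torch
import torch.nn as nn


class GatedFusion(nn.Module):
    """Fuses a textual word vector with an auxiliary modality
    (e.g., syntactic or phonetic features) through a learned sigmoid gate."""

    def __init__(self, text_dim: int, aux_dim: int, out_dim: int):
        super().__init__()
        self.aux_proj = nn.Linear(aux_dim, out_dim)
        self.text_proj = nn.Linear(text_dim, out_dim)
        # The gate decides, per dimension, how much auxiliary information to admit.
        self.gate = nn.Linear(text_dim + aux_dim, out_dim)

    def forward(self, text_vec: torch.Tensor, aux_vec: torch.Tensor) -> torch.Tensor:
        g = torch.sigmoid(self.gate(torch.cat([text_vec, aux_vec], dim=-1)))
        return g * self.aux_proj(aux_vec) + (1.0 - g) * self.text_proj(text_vec)


if __name__ == "__main__":
    # Stacking two gates (one for syntactic, one for phonetic features) yields a
    # multimodal representation; a supervised loss can then be applied on top.
    syn_gate = GatedFusion(text_dim=300, aux_dim=50, out_dim=300)
    pho_gate = GatedFusion(text_dim=300, aux_dim=50, out_dim=300)
    text = torch.randn(4, 300)       # batch of textual word vectors
    syntax = torch.randn(4, 50)      # e.g., POS / dependency features
    phonetics = torch.randn(4, 50)   # e.g., pinyin or phoneme embeddings
    fused = pho_gate(syn_gate(text, syntax), phonetics)
    print(fused.shape)               # torch.Size([4, 300])
```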

RELATED WORKS
PROPOSED METHOD
TASK EVALUATION
CONCEPT CATEGORIZATION TASK
RESULTS AND DISCUSSION
WORD SIMILARITY TASK
WORD ANALOGY TASK
PART-OF-SPEECH TAGGING TASK
TEXT CLASSIFICATION TASK
TEXT SIMILARITY TASK
Conclusions
