Learning Social Image Embedding with Deep Multimodal Attention Networks

Feiran Huang, Xiaoming Zhang, Yueying He, Tao Mei, Zhonghua Zhao, Zhoujun Li

doi:10.1145/3126686.3126720

Abstract

Learning embeddings of social media data with deep models has attracted extensive research interest and enabled many applications, such as link prediction, classification, and cross-modal search. However, for social images, which contain both link information and multimodal content (e.g., text descriptions and visual content), simply employing an embedding learnt from either the network structure or the data content alone results in a sub-optimal representation. In this paper, we propose a novel social image embedding approach called Deep Multimodal Attention Networks (DMAN), which employs a deep model to jointly embed multimodal content and link information. Specifically, to effectively capture the correlations between modalities, we propose a multimodal attention network that encodes the fine-grained relations between image regions and textual words. To leverage the network structure for embedding learning, we propose a novel Siamese-Triplet neural network to model the links among images. With the joint deep model, the learnt embedding captures both the multimodal content and the nonlinear network information. Extensive experiments on multi-label classification and cross-modal search show that DMAN achieves significant improvements over state-of-the-art image embeddings on both tasks.
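
The abstract does not give DMAN's equations, so the following is only a generic sketch of the two ideas it names: soft attention over image-region vectors guided by a text query, and a triplet margin loss that pulls linked images together. The function names (`attend`, `triplet_loss`), the dot-product scoring, the squared Euclidean distance, and the margin value are all assumptions for illustration, not the authors' formulation.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def attend(query, regions):
    """Hypothetical attention step: score each region vector by its dot
    product with the query (e.g. a word vector), normalise the scores with
    softmax, and return the weighted sum of regions plus the weights."""
    weights = softmax([dot(query, r) for r in regions])
    dim = len(regions[0])
    pooled = [sum(w * r[i] for w, r in zip(weights, regions)) for i in range(dim)]
    return pooled, weights


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Standard triplet hinge loss on squared Euclidean distances (assumed
    form): a linked image (positive) should sit closer to the anchor than
    an unlinked one (negative) by at least `margin`."""
    d_pos = sum((a - p) ** 2 for a, p in zip(anchor, positive))
    d_neg = sum((a - n) ** 2 for a, n in zip(anchor, negative))
    return max(0.0, d_pos - d_neg + margin)
```

In this sketch, a region that aligns with the query receives a larger weight, and the loss falls to zero once the unlinked image is far enough from the anchor.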

Similar Papers

From content to links: Social image embedding with deep multimodal model
Feiran Huang ... Yueying He
Knowledge-Based Systems | VOL. 160
18 Jul 2018

Multimodal Network Embedding via Attention based Multi-view Variational Autoencoder
Feiran Huang ... Yueying He
05 Jun 2018

Deep Attentive Multimodal Network Representation Learning for Social Media Images
Feiran Huang ... Sattam Alotaibi
ACM Transactions on Internet Technology | VOL. 21
16 Jun 2021

Deep multi-view representation learning for social images
Feiran Huang ... Yueying He
Applied Soft Computing | VOL. 73
30 Aug 2018
