From content to links: Social image embedding with deep multimodal model

Feiran Huang, Xiaoming Zhang, Zhoujun Li, Zhonghua Zhao, Yueying He

doi:10.1016/j.knosys.2018.07.020

Abstract

With the growing popularity of social networks, social media data embedding has attracted extensive research interest and enabled many applications, such as image classification and cross-modal retrieval. In this paper, we examine the scenario in which social images contain multimodal content (e.g., visual content and textual tags) and are connected to each other (e.g., two images submitted to the same group). In such a case, both the multimodal content and the link information provide useful clues for representation learning. Therefore, learning the embedding from network structure alone or from data content alone results in sub-optimal social image representations. We propose Deep Multimodal Attention Networks (DMAN) to combine multimodal content and link information for social image embedding. Specifically, to effectively incorporate the multimodal content, a visual-textual attention model is proposed to encode the fine-grained correlation between modalities, i.e., the alignment between image regions and textual words. To incorporate the network structure into embedding learning, a novel Siamese-Triplet neural network is proposed to model the first-order and second-order proximity among images. The two modules are then integrated into a joint deep model for social image embedding. Once the representation has been learned, a wide variety of data mining problems can be solved using task-specific algorithms designed for vector representations. Extensive experiments on multi-label classification and cross-modal search demonstrate the effectiveness of our approach.
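The abstract names two ingredients without giving their formulas: attention that aligns image regions with textual words, and a triplet-style objective that uses links between images. The following is a minimal sketch of the general idea only, assuming plain dot-product attention and a standard triplet hinge loss on squared Euclidean distance. It is not the DMAN formulation, it does not distinguish first-order from second-order proximity, and all function names, vectors, and the margin value are hypothetical choices made for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def attend_regions(word_vec, region_vecs):
    """Weight image regions by dot-product similarity to one word vector.

    Assumption: dot-product scoring; the abstract does not specify the
    scoring function used by the visual-textual attention model.
    """
    weights = softmax([dot(word_vec, r) for r in region_vecs])
    dim = len(region_vecs[0])
    attended = [
        sum(w * r[i] for w, r in zip(weights, region_vecs))
        for i in range(dim)
    ]
    return weights, attended


def squared_distance(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Standard triplet hinge loss (an assumption, not the paper's loss).

    The loss is zero once the linked (positive) image is closer to the
    anchor than the unlinked (negative) image by at least `margin`.
    """
    gap = squared_distance(anchor, positive) - squared_distance(anchor, negative)
    return max(0.0, gap + margin)


# Toy data, made up for illustration.
regions = [[1.0, 0.0], [0.0, 1.0]]   # two image-region feature vectors
word = [2.0, 0.0]                    # one tag-word feature vector
weights, attended = attend_regions(word, regions)

anchor = [0.0, 0.0]
linked = [0.1, 0.0]      # e.g. an image submitted to the same group
unlinked = [3.0, 0.0]    # an image with no link to the anchor
loss_satisfied = triplet_loss(anchor, linked, unlinked)
loss_violated = triplet_loss(anchor, unlinked, linked)
```

In this toy setup the first region receives the larger attention weight because it points in the same direction as the word vector, and the triplet loss is zero when the linked image is already much closer to the anchor than the unlinked one. In a joint model of the kind the abstract describes, the attended content features would feed the embeddings that the link-based loss compares; how DMAN wires the two modules together is not stated here.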

Journal: Knowledge-Based Systems
Publication Date: Jul 18, 2018
Citations: 13

Similar Papers

Learning Social Image Embedding with Deep Multimodal Attention Networks
Feiran Huang ... Zhoujun Li
23 Oct 2017

Multimodal Learning of Social Image Representation by Exploiting Social Relations.
Feiran Huang ... Zhonghua Zhao
IEEE Transactions on Cybernetics | VOL. 51
17 Feb 2021

Multimodal Network Embedding via Attention based Multi-view Variational Autoencoder
Feiran Huang ... Yueying He
05 Jun 2018

Deep multi-view representation learning for social images
Feiran Huang ... Yueying He
Applied Soft Computing | VOL. 73
30 Aug 2018
