Deep multi-view representation learning for social images

Feiran Huang, Xiaoming Zhang, Zhonghua Zhao, Zhoujun Li, Yueying He

doi:10.1016/j.asoc.2018.08.010

Abstract

Multi-view representation learning for social images has recently achieved remarkable results in many tasks, such as cross-view classification and cross-modal retrieval. Because social images usually carry link information in addition to their multi-modal content (e.g., text descriptions and visual content), relying on the content alone may yield a sub-optimal multi-view representation. In this paper, we propose a Deep Multi-View Embedding Model (DMVEM) to learn joint embeddings for three views: the visual content, the associated text descriptions, and the relations among images. To encode the link information effectively, a weighted relation network is built from the linkages between social images and then embedded into a low-dimensional vector space using the Skip-Gram model. The learned vector is treated as a third view alongside the visual content and the text description. To learn a joint representation from the three views, a deep learning model with a three-branch nonlinear neural network is proposed. A three-view bi-directional loss function captures the correlation between the three views, and a stacked autoencoder is adopted to preserve the self-structure and reconstructability of the learned representation for each view. Comprehensive experiments are conducted on image-to-text, text-to-image, and image-to-image search tasks. Compared with state-of-the-art multi-view embedding methods, our approach achieves a significant improvement in performance.
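
The abstract does not give the exact form of the three-view bi-directional loss. As a rough illustration only, the sketch below assumes a common margin-based bidirectional ranking loss on cosine similarity, summed over the three pairs of views. The function names, the margin value, and the pairwise summation are assumptions made for this example and are not taken from the DMVEM formulation.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale each row to (approximately) unit length."""
    return x / (np.linalg.norm(x, axis=1, keepdims=True) + eps)


def bidirectional_ranking_loss(a, b, margin=0.2):
    """Margin-based ranking loss in both directions (a -> b and b -> a).

    Row i of `a` and row i of `b` are assumed to describe the same image;
    every other row in the batch is treated as a negative.
    """
    sim = l2_normalize(a) @ l2_normalize(b).T  # cosine similarities, (n, n)
    pos = np.diag(sim)                         # similarity of matched pairs
    # a -> b: the matched b_i should beat every other b_j by `margin`.
    cost_ab = np.maximum(0.0, margin + sim - pos[:, None])
    # b -> a: the matched a_j should beat every other a_i by `margin`.
    cost_ba = np.maximum(0.0, margin + sim - pos[None, :])
    off_diag = ~np.eye(len(a), dtype=bool)     # exclude the matched pairs
    return (cost_ab[off_diag].sum() + cost_ba[off_diag].sum()) / len(a)


def three_view_loss(img, txt, rel, margin=0.2):
    """Assumed combination: sum the pairwise loss over all three view pairs."""
    return (bidirectional_ranking_loss(img, txt, margin)
            + bidirectional_ranking_loss(img, rel, margin)
            + bidirectional_ranking_loss(txt, rel, margin))
```

Here `img`, `txt`, and `rel` stand for batches of embeddings produced by the three branches (visual content, text description, and the Skip-Gram relation vector), all projected to the same dimension. In a real model this quantity would be computed in an automatic-differentiation framework and combined with the autoencoder reconstruction terms.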

Journal: Applied Soft Computing
Publication Date: Aug 30, 2018
Citations: 15

Similar Papers

Chapter 6 - Multimodal learning of social image representation
Feiran Huang ... Weichang Huang
Digital Image Enhancement and Reconstruction | VOL. -
01 Jan 2023

Multimodal Network Embedding via Attention based Multi-view Variational Autoencoder
Feiran Huang ... Yueying He
05 Jun 2018

Deep Attentive Multimodal Network Representation Learning for Social Media Images
Feiran Huang ... Sattam Alotaibi
ACM Transactions on Internet Technology | VOL. 21
16 Jun 2021

Multimodal Learning of Social Image Representation by Exploiting Social Relations.
Feiran Huang ... Zhonghua Zhao
IEEE Transactions on Cybernetics | VOL. 51
17 Feb 2021
