Text to image generative model using constrained embedding space mapping

Subhajit Chaudhury, Md A Salam Khan, Sakyasingha Dasgupta, Asim Munawar, Ryuki Tachibana

doi:10.1109/mlsp.2017.8168111

Abstract

We present a conditional generative method that maps low-dimensional embeddings of images and natural language to a common latent space, thereby extracting the semantic relationships between them. A modality-specific embedding is extracted first; a constrained optimization procedure then projects the two embedding spaces onto a common manifold. Based on this, we present a method to learn the conditional probability distribution of the two embedding spaces by first mapping them to a shared latent space and then generating the individual embeddings back from this common space. However, to enable independent conditional inference, in which each embedding is extracted separately from the common latent space representation, we deploy a proxy variable trick in which the single shared latent space is replaced by two separate latent spaces. We design an objective function that, during training, forces these separate spaces to lie close to each other by minimizing the Euclidean distance between their distribution functions. Experimental results demonstrate that the learned joint model generalizes to learning concepts of double MNIST digits with additional color attributes, thereby enabling the generation of specific colored images from the corresponding text data.
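
The proxy-variable idea described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes linear encoders and decoders, point-valued latents rather than distributions, and a squared Euclidean penalty between paired latents. The names `constrained_mapping_loss`, `text_to_image_embedding`, `params`, and `lam` are hypothetical and introduced only for illustration.

```python
import numpy as np


def constrained_mapping_loss(img_emb, txt_emb, params, lam=1.0):
    """Reconstruction loss plus a penalty tying two latent spaces together.

    Assumption: linear maps and point latents stand in for the learned
    networks and latent distributions mentioned in the abstract.
    """
    # Project each modality's embedding into its own latent space.
    z_img = img_emb @ params["enc_img"]
    z_txt = txt_emb @ params["enc_txt"]

    # Decode each latent back to the embedding of its own modality.
    img_rec = z_img @ params["dec_img"]
    txt_rec = z_txt @ params["dec_txt"]

    # Mean squared reconstruction error, summed over both modalities.
    recon = (np.mean(np.sum((img_rec - img_emb) ** 2, axis=1))
             + np.mean(np.sum((txt_rec - txt_emb) ** 2, axis=1)))

    # Mean squared Euclidean distance between paired latents; minimizing it
    # pulls the two separate latent spaces toward each other.
    closeness = np.mean(np.sum((z_img - z_txt) ** 2, axis=1))

    return recon + lam * closeness, recon, closeness


def text_to_image_embedding(txt_emb, params):
    """Conditional inference: text latent decoded by the image decoder."""
    return (txt_emb @ params["enc_txt"]) @ params["dec_img"]
```

Under these assumptions, once the closeness term is small, a text latent can be passed to the image decoder on its own, which is the kind of independent conditional inference the two-latent-space construction is meant to allow.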

Similar Papers

ZeroNLG: Aligning and Autoencoding Domains for Zero-Shot Multimodal and Multilingual Natural Language Generation.
Bang Yang ... David A Clifton
IEEE Transactions on Pattern Analysis and Machine Intelligence | VOL. 46
01 Aug 2024

Interlayer Link Prediction in Multiplex Social Networks Based on Multiple Types of Consistency Between Embedding Vectors.
Rui Tang ... Shuyu Jiang
IEEE Transactions on Cybernetics | VOL. 53
01 Apr 2023

Unsupervised Deformable Registration for Multi-modal Images via Disentangled Representations
Chen Qin ... Tommaso Mansi
01 Jan 2019

Common Latent Embedding Space for Cross-Domain Facial Expression Recognition
Run Wang ... Peng Song
IEEE Transactions on Computational Social Systems | VOL. 11
01 Apr 2024
