Learning Multi-Modal Word Representation Grounded in Visual Context

Éloi Zablocki,Benjamin Piwowarski,Laure Soulier,Patrick Gallinari

doi:10.1609/aaai.v32i1.11939

Open Access

Journal: Proceedings of the AAAI Conference on Artificial Intelligence
Publication Date: Apr 27, 2018
Citations: 20

Affiliation: Laboratoire de Recherche en Informatique de Paris 6, Sorbonne Université, French National Centre for Scientific Research, Délégation Ile-de-France Villejuif

#Visual Context #Textual Context

Abstract

Representing the semantics of words is a long-standing problem for the natural language processing community. Most methods compute word semantics from the textual context of words in large corpora. More recently, researchers have attempted to integrate perceptual and visual features. Most of this work uses the visual appearance of objects to enhance word representations but ignores the visual environment and context in which objects appear. We propose to unify text-based techniques with vision-based techniques by simultaneously leveraging textual and visual context to learn multimodal word embeddings. We explore various choices for what can serve as a visual context and present an end-to-end method to integrate visual context elements in a multimodal skip-gram model. We provide experiments and an extensive analysis of the results.
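
To make the idea of a multimodal skip-gram objective concrete, the following is a minimal sketch and not the authors' implementation. It uses the standard skip-gram negative-sampling loss and applies it twice for one target word: once over textual context vectors and once over visual context vectors. Several things here are assumptions made purely for illustration: that visual context elements have already been mapped to vectors of the same dimension as the word embedding, that the two terms are combined by a weighted sum, and every name in the code (`sgns_loss`, `multimodal_loss`, `alpha`). The abstract specifies none of these details.

```python
import math


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def log_sigmoid(x):
    """Numerically stable log(sigmoid(x))."""
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))


def sgns_loss(word, positives, negatives):
    """Skip-gram negative-sampling loss for one target word.

    positives: context vectors observed with the word.
    negatives: randomly sampled context vectors.
    """
    loss = -sum(log_sigmoid(dot(word, c)) for c in positives)
    loss -= sum(log_sigmoid(-dot(word, c)) for c in negatives)
    return loss


def multimodal_loss(word, text_pos, text_neg, vis_pos, vis_neg, alpha=0.5):
    """Hypothetical combination: weighted sum of a textual and a visual term.

    alpha is an illustrative mixing weight, not a parameter from the paper.
    """
    text_term = sgns_loss(word, text_pos, text_neg)
    visual_term = sgns_loss(word, vis_pos, vis_neg)
    return (1.0 - alpha) * text_term + alpha * visual_term


# Toy usage with made-up 3-dimensional vectors.
word = [0.1, -0.2, 0.3]
text_pos = [[0.2, 0.0, 0.1]]
text_neg = [[-0.3, 0.4, 0.0], [0.1, 0.1, -0.2]]
vis_pos = [[0.0, -0.1, 0.5]]   # e.g. a projected feature of a co-occurring object
vis_neg = [[0.3, 0.3, -0.4]]
print(multimodal_loss(word, text_pos, text_neg, vis_pos, vis_neg))
```

In this sketch the same word vector is scored against both kinds of context, so gradients from the visual term would shape the embedding alongside the textual term. Setting `alpha` to 0 recovers a purely text-based skip-gram loss.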
