Grounded Semantic Composition for Visual Scenes

P Gorniak, D Roy

doi:10.1613/jair.1327

Open Access

https://doi.org/10.1613/jair.1327

Journal: Journal of Artificial Intelligence Research
Publication Date: Apr 1, 2004
Citations: 172
License type: publisher-specific-oa

Abstract

We present a visually-grounded language understanding model based on a study of how people verbally describe objects in scenes. The emphasis of the model is on the combination of individual word meanings to produce meanings for complex referring expressions. The model has been implemented, and it is able to understand a broad range of spatial referring expressions. We describe our implementation of word-level visually-grounded semantics and their embedding in a compositional parsing framework. The implemented system selects the correct referents in response to natural language expressions for a large percentage of test cases. In an analysis of the system's successes and failures, we reveal how visual context influences the semantics of utterances and propose future extensions to the model that take such context into account.
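
To make the idea of composing word meanings concrete, the following is a minimal sketch and not the authors' implementation. It assumes a toy scene of objects with a colour and a horizontal position, and it treats each word as a function that narrows a list of candidate referents. The names `SceneObject`, `LEXICON`, and `resolve`, as well as the scene contents, are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class SceneObject:
    """A hypothetical scene object; the fields are illustrative assumptions."""
    name: str
    color: str
    x: float  # horizontal position, smaller means further left


# Assumption: a word meaning narrows a list of candidate referents.
WordMeaning = Callable[[List[SceneObject]], List[SceneObject]]


def color_word(color: str) -> WordMeaning:
    """Build a word meaning that keeps only objects of the given colour."""
    return lambda objs: [o for o in objs if o.color == color]


def leftmost(objs: List[SceneObject]) -> List[SceneObject]:
    """Keep the single object with the smallest x among the candidates."""
    return [min(objs, key=lambda o: o.x)] if objs else []


def identity(objs: List[SceneObject]) -> List[SceneObject]:
    """Words such as 'the' and 'one' leave the candidates unchanged here."""
    return list(objs)


LEXICON: Dict[str, WordMeaning] = {
    "the": identity,
    "one": identity,
    "green": color_word("green"),
    "purple": color_word("purple"),
    "leftmost": leftmost,
}


def resolve(expression: str, scene: List[SceneObject]) -> List[SceneObject]:
    """Compose word meanings from the head noun outward (right to left)."""
    candidates = list(scene)
    for word in reversed(expression.lower().split()):
        candidates = LEXICON[word](candidates)
    return candidates


scene = [
    SceneObject("a", "green", 0.7),
    SceneObject("b", "purple", 0.1),
    SceneObject("c", "green", 0.3),
]
referents = resolve("the leftmost green one", scene)
print([o.name for o in referents])
```

In this sketch the order of composition matters: "leftmost" is applied to the objects already selected by "green", so the expression picks the leftmost of the green objects and not the leftmost object overall. A real system of the kind the abstract describes would need a parser and perceptually grounded word models; the flat right-to-left application here is a simplifying assumption.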
