Abstract

In image-text presentations from online discourse, pronouns can refer to entities depicted in images, even if these entities are not otherwise referred to in a text caption. While visual salience may be enough to allow a writer to use a pronoun to refer to a prominent entity in the image, coherence theory suggests that pronoun use is more restricted. Specifically, language users may need an appropriate coherence relation between text and imagery to license and resolve pronouns. To explore this hypothesis and better understand the relationship between image context and text interpretation, we annotated an image-text data set with coherence relations and pronoun information. We find that pronoun use reflects a complex interaction between the content of the pronoun, the grammar of the text, and the relation of text and image.

Highlights

  • Image-text presentations are widely available on the internet, in captioned images, social media posts, and web pages

  • We found that pronoun use depends on the kind of relation between the image and its caption

  • Overall, Visible coherence relations occurred with high frequency; in captions, indexical and indexical/bound personal pronouns were the most frequent, followed by anaphoric pronouns

Summary

Introduction

Image-text presentations are widely available on the internet, in captioned images, social media posts, and web pages. These image-text presentations provide a valuable proxy for situated language, enabling indirect inferences about face-to-face conversation, the primary setting for language learning and language use. One fundamental difference is the semantic relationship between text and imagery: the model caption summarizes the image while the puppy caption links the image content to further facts about the speaker. These various relations lead to different ways in which we can identify objects in imagery through the use of a caption. A key case concerns the use of pronouns, which, in image-text presentations such as in the puppy image-caption example above, can refer deictically to entities from the image.

