Abstract

We present a model of pragmatic referring expression interpretation in a grounded communication task (identifying colors from descriptions) that draws upon predictions from two recurrent neural network classifiers, a speaker and a listener, unified by a recursive pragmatic reasoning framework. Experiments show that this combined pragmatic model interprets color descriptions more accurately than the classifiers from which it is built, and that much of this improvement results from combining the speaker and listener perspectives. We observe that pragmatic reasoning helps primarily in the hardest cases: when the model must distinguish very similar colors, or when few utterances adequately express the target color. Our findings make use of a newly collected corpus of human utterances in color reference games, which exhibit a variety of pragmatic behaviors. We also show that the embedded speaker model reproduces many of these pragmatic behaviors.

Highlights

  • Experiments on the data in our corpus show that this combined pragmatic model improves accuracy in interpreting human-produced descriptions over the basic recurrent neural network (RNN) listener alone

  • For ease of comparison to computational results, we focus on five metrics capturing different aspects of pragmatic behavior displayed by both human and artificial speakers in our task (Table 2)

  • Listener-based listener. The starting point of Rational Speech Acts (RSA) is a model of a literal listener: l0(t | u, L) ∝ L(u, t) P(t), where t is a color in the context set C, u is a message drawn from a set of possible utterances U, P is a prior over colors, and L(u, t) is a semantic interpretation function that takes the value 1 if u is true of t, else 0
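
The literal listener formula in the last highlight can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration: colors are represented as string labels rather than color values, the lexicon `TOY_LEXICON` is hand-written rather than learned, and the function name `literal_listener` is hypothetical. It only shows how l0(t | u, L) ∝ L(u, t) P(t) normalizes over the context set C.

```python
from typing import Callable, Dict, Sequence

# Hypothetical toy lexicon: which utterances are true of which colors.
# In the formula this plays the role of L(u, t), returning 1 or 0.
TOY_LEXICON = {
    "blue": {"blue", "teal"},
    "red": {"red"},
}


def toy_semantics(utterance: str, color: str) -> int:
    """Return 1 if the utterance is true of the color in the toy lexicon, else 0."""
    return 1 if color in TOY_LEXICON.get(utterance, set()) else 0


def literal_listener(
    utterance: str,
    context: Sequence[str],
    prior: Dict[str, float],
    semantics: Callable[[str, str], int],
) -> Dict[str, float]:
    """Compute l0(t | u, L) proportional to L(u, t) * P(t) over the context set."""
    scores = {t: semantics(utterance, t) * prior[t] for t in context}
    total = sum(scores.values())
    if total == 0:
        # The utterance is false of every color in the context,
        # so the distribution is undefined under this formula.
        raise ValueError("utterance is not true of any color in the context")
    return {t: score / total for t, score in scores.items()}


# Assumed example context with a uniform prior over three colors.
context = ["blue", "teal", "red"]
uniform_prior = {t: 1.0 / len(context) for t in context}
posterior = literal_listener("blue", context, uniform_prior, toy_semantics)
```

With this toy lexicon, the utterance "blue" is true of two of the three colors, so the literal listener splits its probability evenly between them and assigns none to the third. This is the kind of ambiguity that pragmatic reasoning on top of a literal listener is meant to reduce.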


Summary

Introduction

Our most successful model integrates speaker and listener perspectives, combining predictions made by a system trained to understand color descriptions and one trained to produce them. We evaluate this model with a new, psycholinguistically motivated corpus of real-time, dyadic reference games in which the referents are patches of color. Experiments on the data in our corpus show that this combined pragmatic model improves accuracy in interpreting human-produced descriptions over the basic RNN listener alone. Pragmatic reasoning on top of the listener RNN alone yields improvements, which come primarily in the hardest cases: (1) contexts with colors that are very similar, requiring the interpretation of descriptions that convey fine distinctions; and (2) target colors that most referring expressions fail to identify, whether due to a lack of adequate descriptive terms or a consistent bias against the color in the RNN listener.

Task and data collection
Behavioral results
Listener behavior
Speaker behavior
Models
Base listener
Base speaker
Pragmatic agents
Training
Listener accuracy
Model analysis
Related work
Findings
Conclusion
