As humans, we have the capacity to refer to the things in the world around us. In everyday spoken communication, we often use words to describe intended referents (such as objects, people, and events), and our bodies (e.g., eyes, head, and hands) to indicate the location to which our addressee should direct her attention in order to further identify what we are talking about (Bühler, 1934; Clark and Bangerter, 2004). Traditionally, referring has been described as an autonomous and addressee-blind act that speakers perform on their own, without taking into account beliefs about their addressees' knowledge of a referent (e.g., Olson, 1970; see Clark and Bangerter, 2004). In contrast, more recent views consider it a collaborative enterprise that requires speaker and addressee to work together, for instance in reaching mutual agreement on how to conceptualize and name a particular entity (e.g., Clark and Wilkes-Gibbs, 1986; Brennan and Clark, 1996; Clark and Bangerter, 2004). Such agreement is established through interaction, and the addressee is at least as important as the speaker in reaching agreement and establishing reference. In prototypical instances of successful referring, speakers often produce spatial demonstratives like this and that to establish joint attention between speaker and addressee to a visible entity (Bühler, 1934; Levinson, 1983). Such demonstratives are among the most frequently used words in language, among the first words infants produce (Clark and Sengul, 1978), and possibly primordial in phylogeny (Diessel, 2006; Tomasello, 2008). Surprisingly, despite the advances made toward a social, collaborative account of referring more generally, the prevailing theoretical view on spatial demonstratives has remained deeply individual and egocentric, as illustrated by the following claims: “[T]he anchoring point of deictic expressions is egocentric (or, better, speaker-centric). Adult speakers skillfully relate what they are talking about to this me-here-now” (Levelt, 1989, p. 46). Spatial demonstratives “indicate the relative distance of an object, location, or person vis-a-vis the deictic center (…), which is usually associated with the location of the speaker” (Diessel, 1999, p. 36). “[D]emonstratives are interpreted based on the speaker's body” (Diessel, 2014, p. 122). This egocentric account is intuitively appealing and still influential (e.g., Diessel, 2014; Stevens and Zhang, 2014). In the current paper, we question this account from both the production and the comprehension side, and discuss recently accumulated observational, experimental, and neuroscientific evidence that suggests an alternative social and multimodal view of demonstrative reference.