ABSTRACT - We used eye tracking to examine the relative influence of the mouth and eyes on perception of sung interval size. Frequency and duration of gaze were tracked while participants rated the size of intervals produced by two singers in three signal-to-noise conditions, corresponding to high, medium and low audibility. All intervals ascended in pitch direction and ranged in size from 0 to 12 semitones. Both the frequency and duration of gaze fixations revealed that the mouth was the most salient aspect of the visual channel. However, gaze was diverted away from the mouth and toward the eyes with increasing audibility, interval size, and tonal consonance of intervals. A linear regression model incorporating all of these variables accounted for 50% of the variability in gaze duration for the mouth and 45% of the variability in gaze duration for the eyes. Results are discussed in the context of dynamic allocation of attentional resources on the basis of early registration of sensory input. This is the first study of singing to incorporate eye-tracking methodology.

KEYWORDS - eye tracking, interval size, cross-modal, visual influences, singing

Although cognitive science tends to approach music from the perspective of the auditory modality, a number of empirical studies published over the past decade have demonstrated the important role that the visual modality has in shaping our experience of music. Visual aspects of music performance can influence perception of emotion (Dahl & Friberg, 2004; Davidson & Correia, 2002; Thompson, Graham, & Russo, 2005), physiological response (Chapados & Levitin, 2008), perceived tension (Vines, Krumhansl, Wanderley, & Levitin, 2006), and even structural characteristics of music such as perceived note duration (Schutz & Lipscomb, 2007) and sung interval size (Thompson, Russo, & Livingstone, 2010). The importance of the visual modality in the perception of singing may be owed to a number of factors.
These include the natural entwinement of visual and auditory dynamics over the course of song (Thompson et al., 2005), specialized neural circuitry shaped by extensive experience with audio-visual speech (Dick, Solodkin, & Small, 2010), and multimodal mechanisms subserving communication that may pre-date song as well as speech (Mithen, 2005, p. 138).

An obvious way in which visual information exerts an influence in perception of singing is in the realm of emotion (Di Carlo, 2004; Thompson et al., 2005). For example, an audio-visual recording of a minor third can be made to convey more happiness if the visual recording is substituted with that of a major third (Thompson, Russo, & Quinto, 2008). The availability of visual information in song is also known to influence perception of phonemes (Quinto, Thompson, Russo, & Trehub, 2010) and comprehension of sung lyrics (Hidalgo-Barnes & Massaro, 2007; Jesse & Massaro, 2010).

One of the more surprising influences of visual information in song is on perception of interval size. An audio-visual recording of a large interval can be made to sound smaller if the visual recording is replaced with that of a smaller interval (Thompson et al., 2010). Remarkably, the effect of visual information persists even when listeners are (a) asked to focus on auditory information alone and (b) encumbered with a demanding secondary task, suggesting that the visual influence relies upon automatic and pre-attentive mechanisms. One implication of these findings is that gaze behavior responds in a dynamic manner to changes in the availability of auditory and visual information.

Thompson and Russo (2007) investigated the utility of the visual modality for making judgments of interval size by presenting participants with silent videos of singers who sang ascending intervals and by asking participants to rate the size of each interval.
The near-perfect correlation observed between rated interval size and veridical interval size implies that observers are able to discriminate intervals on the basis of visual information alone. …