Listeners' affective response to music has received extensive empirical and theoretical treatment that considers culturally specific factors such as tonal familiarity, and surface structures such as acoustic intensity (Balkwill & Thompson, 1999; Balkwill, Thompson, & Matsunaga, 2004; Dean, Bailes, & Schubert, 2011; Grewe, Nagel, Kopiez, & Altenmuller, 2007; Juslin, 2013; Juslin & Sloboda, 2001; Juslin & Vastfjall, 2008; Meyer, 1956; Olsen & Stevens, 2013). The primary framework used to investigate the link between music and affect is the two-dimensional circumplex model (Russell, 1980). In this framework, the dimension of arousal is commonly characterized in terms of activation, with anchors such as active/aroused and passive/calm (Schubert, 2010). The second dimension, valence, comprises positive and negative anchors and may be conceptualized as the pleasantness of a stimulus. Other multidimensional models have also been proposed that further divide arousal into energy arousal and tension arousal (e.g., Thayer, 1978, 1986).
Russell's (1980) circumplex model is a robust framework apt for research on perceived affect in visual and auditory domains (Bradley & Lang, 2000b; Eerola & Vuoskoski, 2011). For example, affective pictures and naturally occurring sounds are widely represented across both dimensions (Bradley & Lang, 2000a, 2000b). Studies using retrospective ratings of affect in response to music (i.e., measured after a listener has heard a musical excerpt) show that perceived arousal is significantly associated with acoustic cues such as intensity, spectral flux, and spectral entropy (Gabrielsson & Lindstrom, 2010; Gingras, Marin, & Fitch, 2013; Ilie & Thompson, 2006; Leman, Vermeulen, De Voogdt, Moelants, & Lesaffre, 2005). 
Furthermore, by using a two-dimensional emotion-space interface (Schubert, 1999), continuous real-time measurements of perceived affect recorded throughout a musical excerpt result in time series models that significantly predict perceived arousal from a small number of musical features (Bailes & Dean, 2012; Dean & Bailes, 2010; Schubert, 2004, 2013). For example, acoustic intensity profiles in Western classical and electroacoustic music can significantly predict continuous changes in perceived arousal, and this has been supported by causal experiments in which the intensity profiles of pieces have been manipulated (Dean & Bailes, 2011). Timbral features such as spectral centroid and spectral flatness have weaker effects (Dean & Bailes, 2010, 2011). On the other hand, perceived valence in response to music is commonly characterized by high variability in retrospective responses (e.g., Gomez & Danuser, 2004; Leman et al., 2005) and low predictive power from computational models (e.g., Bailes & Dean, 2012; Dean & Bailes, 2010; Korhonen, Clausi, & Jernigan, 2006; Schubert, 2004). Furthermore, causal links between acoustic aspects of music and perceived valence have yet to be demonstrated.
So why is perceived valence predicted less well by acoustic cues than perceived arousal? Valence may be closely associated with more personal factors that could be culture-specific and less obligatory than the effects of acoustic cues on perceived arousal; for example, the individual motivational aspects of attention and interest that underpin listeners' engagement with a piece of music (Broughton, Stevens, & Schubert, 2008; Geringer & Madsen, 2000/2001; Madsen & Geringer, 2008; Thompson, 2007). 
In general terms, engagement refers to an active, constructive, focused interaction with one's social and physical environment that relates to cognitive, behavioral, and, importantly for the present study, affective elements of motivation (Broughton et al., 2008; Furrer & Skinner, 2003; Reeve, Jang, Carrell, Jeon, & Barch, 2004). In music, a listener's real-time engagement is likely to be associated with greater attention and interest, and to correlate (either negatively or positively) with affective valence responses such as enjoyment and pleasantness (or lack thereof)¹. …