Social and emotional cues from faces and voices are highly relevant and have been reliably shown to attract attention involuntarily. However, findings are mixed regarding the degree to which emotional valence becomes associated with faces automatically. In the present study, we tested whether inherently neutral faces gain additional relevance through conditioning with positive, negative, or neutral vocal affect bursts. During learning, participants performed a gender-matching task on face-voice pairs without explicitly judging the emotion of the voices. In the test session on a subsequent day, only the previously associated faces were presented and had to be categorized by gender. We analyzed event-related potentials (ERPs), pupil diameter, and response times (RTs) of N = 32 participants. Emotion effects were found in auditory ERPs and RTs during the learning session, suggesting that task-irrelevant emotion was processed automatically. However, ERPs time-locked to the conditioned faces were modulated mainly by the task-relevant information, that is, the gender congruence of face and voice, but not by emotion. Importantly, these ERP and RT effects of learned congruence were not limited to the learning session but extended to the test session, that is, after the auditory stimuli had been removed. These findings indicate successful associative learning in our paradigm, but learning did not extend to the task-irrelevant dimension of emotional relevance. Thus, cross-modal associations of emotional relevance may not be fully automatic, even though the emotion was processed in the voice.