When a picture is repeatedly named in the context of semantically related pictures (homogeneous context), responses are slower than when the picture is repeatedly named in the context of unrelated pictures (heterogeneous context). This semantic interference effect in blocked-cyclic naming plays an important role in devising theories of word production. Wöhner, Mädebach, and Jescheniak [2021; Wöhner, S., Mädebach, A., & Jescheniak, J. D. Naming pictures and sounds: Stimulus type affects semantic context effects. Journal of Experimental Psychology: Human Perception and Performance, 47, 716-730, 2021] have shown that the effect is substantially larger when participants name environmental sounds than when they name pictures. We investigated possible reasons for this difference, using EEG and pupillometry. The behavioral data replicated the findings of Wöhner and colleagues. ERPs were more positive in the homogeneous than in the heterogeneous context over central electrode locations between 140 and 180 msec and between 250 and 350 msec for picture naming, and between 250 and 350 msec for sound naming, presumably reflecting semantic interference during semantic and lexical processing. The later component was of similar size for pictures and sounds. ERPs were more negative in the homogeneous than in the heterogeneous context over frontal electrode locations between 400 and 600 msec, but only for sounds. The pupillometric data showed stronger pupil dilation in the homogeneous than in the heterogeneous context, again only for sounds. The amplitudes of the late ERP negativity and of the pupil dilation predicted naming latencies for sounds in the homogeneous context. The latency of the effects indicates that the difference in semantic interference between picture and sound naming arises at later, presumably postlexical, processing stages closer to articulation. We suggest that the processing of the auditory stimuli interferes with phonological response preparation and self-monitoring, leading to enhanced semantic interference.