ABSTRACT Psycholinguistic research on metaphor has focused on verbal material. Yet metaphors frequently occur in a multimodal format, blending words and pictures to convey meaning. Here we compared verbal and multimodal metaphors using item pairs in which the first stimulus was always a word (e.g., language in the metaphorical conditions and river in the literal conditions) and the second stimulus was either a word (bridge) or a picture (the image of a bridge). Both types of metaphor elicited a similar N400 effect relative to literal pairs; at later latencies, visual metaphors were associated with a more pronounced negativity than literal pictures, whereas no effect was observed in the verbal domain. These findings indicate that both visual and verbal metaphors recruit the conceptual operations reflected in the N400, but that elaboration lasts longer for visual metaphors. This difference in time course was driven by the low number of alternative interpretations and their closedness, pointing to the costs, rather than the facilitation, of integrating visual signs at an abstract level of representation.