Every day we integrate meaningful information from different sensory modalities, and previous work has debated whether conceptual knowledge is represented in modality-specific neural stores specialized for particular types of information, in an amodal, shared system, or both. In the current study, we investigated semantic processing with a cross-modal paradigm that asked whether auditory semantic processing could be modulated by the constraints of context built up across a meaningful visual narrative sequence. We recorded event-related brain potentials (ERPs) to auditory words and sounds associated with events in visual narratives (i.e., seeing images of someone spitting while hearing either a word, "Spitting!", or the sound of spitting) that were either semantically congruent or incongruent with the climactic visual event. Our results showed that both incongruent sounds and incongruent words evoked an N400 effect; however, the scalp distribution of the N400 effect to words (centro-parietal) differed from that to sounds (frontal). In addition, the N400 to words had an earlier latency than the N400 to sounds. Despite these differences, both N400s were followed by a sustained late frontal negativity that did not differ between modalities. These results support the idea that semantic memory balances a distributed cortical network accessible from multiple modalities with amodal processing insensitive to the specific modality.