BACKGROUND: Recent findings of a tight coupling between visual and auditory association cortices during multisensory perception in monkeys and humans raise the question of whether consistent paired presentation of simple visual and auditory stimuli prompts conditioned responses in unimodal auditory regions or multimodal association cortex when the visual stimuli are subsequently presented in isolation in a post-conditioning run. To address this issue, fifteen healthy participants took part in a "silent" sparse temporal event-related fMRI study. In the first (visual control) habituation phase they were presented with brief red flashing visual stimuli. In the second (auditory control) habituation phase they heard brief telephone ringing. In the third (conditioning) phase we presented the visual stimulus (CS) concurrently with the auditory stimulus (UCS). In the fourth phase participants either viewed flashes paired with the auditory stimulus (maintenance, CS-) or viewed the visual stimulus in isolation (extinction, CS+) according to a 5:10 partial reinforcement schedule. The participants had no task other than attending to the stimuli and indicating the end of each trial by pressing a button. RESULTS: During unpaired visual presentations (both preceding and following the paired presentation) we observed significant brain responses beyond primary visual cortex in the bilateral posterior auditory association cortex (planum temporale, planum parietale) and in the right superior temporal sulcus, whereas the primary auditory regions were not involved. By contrast, the activity in auditory core regions was markedly larger when participants were presented with auditory stimuli. CONCLUSION: These results demonstrate the involvement of multisensory and auditory association areas in the perception of unimodal visual stimulation, which may reflect the instantaneous forming of multisensory associations and cannot be attributed to the sensation of an auditory event.
More importantly, we show that brain responses in multisensory cortices do not necessarily emerge from associative learning but can also occur spontaneously in response to simple visual stimulation.