Throughout the day, humans react to multisensory events conveying both visual and auditory signals by rapidly reorienting their gaze. Several studies have shown that sounds can affect the latency of visually guided saccades depending on when and where they are delivered. Unlocalized beeps delivered near the onset time of a visual target reduce latencies, more for early beeps and less for late beeps [1]. However, this modulation is far weaker than that observed for perceptual temporal judgments [2]. Here we tested our previous assumption that beeps shift the perceived timing of target onset and result in two competing effects on saccade latencies: a multisensory modulation in line with the expected perceptual effect and an illusory gap/overlap effect, resulting from target appearance being perceived later/closer in time than fixation offset and shortening/lengthening saccade latencies. Gap/overlap effects involve an oculomotor component associated with neuronal activity in the superior colliculus (SC), a multisensory subcortical structure devoted to sensory-motor transformation. We therefore predicted that the interfering illusory gap/overlap effect would be weaker for manual responses, which involve distinct multisensory areas. In three experiments we manipulated the delay between target onset and an irrelevant auditory beep (stimulus onset asynchrony; SOA) and between target onset and fixation offset (real gap/overlap). Targets appeared left/right of fixation and participants were instructed to make quick saccades or button presses towards the targets. Adding a real overlap/gap (50% of SOA) compensated for the illusory gap/overlap by increasing the beep-related modulation of saccade latencies across the entire SOA range, whereas it barely affected manual responses. However, although auditory and gap/overlap effects modulated saccade latencies in similar ways, these effects were additive and could saturate, suggesting that they reflect independent mechanisms.
Therefore, multisensory temporal binding affects perception and oculomotor control differently, likely due to the involvement of the SC in both saccade programming and multisensory integration.