When a user searches for an object in a complex visual scene, Augmented Reality (AR) can assist the search with visual cues that persistently point in the target’s direction. The effectiveness of these visual cues can be reduced if they are placed at a different visual depth plane from the target they indicate. To overcome this visual-depth problem, we test the effectiveness of adding simultaneous spatialized auditory cues fixed at the target’s location. In an experiment, we manipulated which cues were available (visual-only vs. visual + auditory) and the disparity plane, relative to the target, on which the visual cue was displayed. Results show that participants were slower to find targets when the visual cue was placed on a different disparity plane from the target. However, this slowdown in search performance was substantially reduced by auditory cueing. These results demonstrate the importance of cross-modal cueing in AR under conditions of visual uncertainty and suggest that designers should consider augmenting visual cues with auditory ones.