Echolocation is typically associated with bats and toothed whales. To date, only a few studies have investigated echolocation in humans, and these experiments were conducted with real objects in real rooms, a configuration in which the features of both the vocal emissions and the perceptual cues are difficult to analyse and control. We investigated human sonar target ranging in virtual echo-acoustic space, using a short-latency, real-time convolution engine. Subjects produced tongue clicks, which were picked up by a headset microphone, digitally delayed, convolved with individual head-related transfer functions (HRTFs) and played back through earphones, thus simulating a reflecting surface at a specific range in front of the subject. In an adaptive two-alternative forced-choice (2-AFC) paradigm, we measured perceptual sensitivity to changes in range for reference ranges of 1.7, 3.4 and 6.8 m. In a follow-up experiment, a second simulated surface was added at a lateral position and a fixed range; it was expected to act either as an interfering masker or as a useful reference. The psychophysical data show that the subjects were well able to discriminate differences in the range of a frontal reflector. Range-discrimination thresholds were typically below 1 m and, for the 1.7 m reference range, typically below 0.5 m. Performance improved when the second reflector was introduced at a lateral angle of 45°. A detailed analysis of the tongue clicks showed that the subjects typically produced short, broadband palatal clicks with durations between 3 and 15 ms and sound levels between 60 and 108 dB. The clicks typically had relatively high peak frequencies, around 6 to 8 kHz. By combining highly controlled psychophysical experiments in virtual space with a detailed analysis of both the subjects' performance and their emitted tongue clicks, the current experiments provide insight into both the vocal-motor and the sensory processes recruited by humans aiming to explore their environment through echolocation.
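To make the virtual echo-acoustic rendering concrete, the following is a minimal offline sketch of the delay-and-HRTF-convolution step described above. All names, the 48 kHz sample rate, and the offline treatment are assumptions for illustration; the actual study used a short-latency, real-time convolution engine, and the per-subject head-related impulse response (HRIR) pair would come from individual measurements.

```python
import numpy as np
from scipy.signal import fftconvolve

SPEED_OF_SOUND = 343.0  # m/s at roughly 20 degrees C

def render_echo(click, hrir_left, hrir_right, target_range_m,
                fs=48000, reflection_gain=1.0):
    """Delay a recorded mono click by the two-way travel time to a
    reflector at `target_range_m` and convolve it with the subject's
    HRIR pair (assumed equal length), yielding the binaural echo to
    be played back over earphones."""
    # Two-way travel time: the click travels to the reflector and back.
    delay_samples = int(round(2.0 * target_range_m / SPEED_OF_SOUND * fs))
    delayed = np.concatenate([np.zeros(delay_samples),
                              reflection_gain * np.asarray(click)])
    # Spatialise the delayed echo with the individual HRIRs.
    left = fftconvolve(delayed, hrir_left)
    right = fftconvolve(delayed, hrir_right)
    return np.stack([left, right])  # shape: (2, n_samples)
```

For example, the 1.7 m reference range corresponds to a two-way delay of 2 × 1.7 / 343 ≈ 9.9 ms, i.e. about 476 samples at 48 kHz.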
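The abstract does not specify the adaptive rule used in the 2-AFC range-discrimination task, so the sketch below assumes a common 2-down/1-up transformed staircase, which converges on roughly 70.7 % correct. The callback `present_trial` is hypothetical: it would run one trial with the given reference and comparison ranges and return True if the subject responded correctly.

```python
def run_staircase(present_trial, reference_m=1.7, start_delta_m=1.0,
                  step_factor=1.5, min_delta_m=0.01, n_reversals=8):
    """2-down/1-up staircase on the range difference `delta` between a
    reference reflector and a comparison reflector; returns a threshold
    estimate as the mean of the reversal points."""
    delta = start_delta_m
    correct_in_a_row = 0
    last_direction = 0  # +1 = last step made the task easier, -1 = harder
    reversals = []
    while len(reversals) < n_reversals:
        if present_trial(reference_m, reference_m + delta):
            correct_in_a_row += 1
            if correct_in_a_row == 2:       # two correct in a row -> harder
                correct_in_a_row = 0
                if last_direction == +1:
                    reversals.append(delta)
                last_direction = -1
                delta = max(delta / step_factor, min_delta_m)
        else:                               # one incorrect -> easier
            correct_in_a_row = 0
            if last_direction == -1:
                reversals.append(delta)
            last_direction = +1
            delta *= step_factor
    return sum(reversals) / len(reversals)
```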
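Similarly, the reported click peak frequencies suggest an analysis along the lines of the sketch below, which estimates the peak frequency of a recorded tongue click from its windowed magnitude spectrum. The function and its parameters are assumptions; the paper's actual analysis pipeline is not described in the abstract.

```python
import numpy as np

def click_peak_frequency(click, fs=48000):
    """Return the frequency (Hz) at which the click's magnitude
    spectrum is maximal."""
    windowed = np.asarray(click) * np.hanning(len(click))
    spectrum = np.abs(np.fft.rfft(windowed))
    freqs = np.fft.rfftfreq(len(windowed), d=1.0 / fs)
    return freqs[np.argmax(spectrum)]
```

Applied to the clicks described above, this estimate would typically fall in the reported 6 to 8 kHz region.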