Generations of researchers have observed a mismatch between headphone and loudspeaker presentation: the sound pressure level at the eardrum generated by a headphone has to be about 6 dB higher than the level created by a loudspeaker that elicits the same loudness. While it has been shown that this effect vanishes if the same waveforms are generated at the eardrum in a blind comparison, the origin of the mismatch is still unclear. We present new data that systematically characterize this mismatch under variation of the stimulus frequency, the presentation room, and the binaural parameters of the headphone presentation. Subjects adjusted the playback level of a headphone presentation until it was as loud as the loudspeaker presentation, and the levels at the eardrum were determined through appropriate transfer function measurements. Identical experiments were conducted in Oldenburg and Aachen with 40 normal-hearing subjects, including 14 who took part at both sites. Our data confirm a mismatch between loudspeaker and binaural headphone presentation, especially at low frequencies. This mismatch depends on the room acoustics and on the interaural coherence in both presentation modes. It vanishes for high frequencies and broadband signals if individual differences in the sound transfer to the eardrums are accounted for. Moreover, small acoustic and non-acoustic differences in an anechoic reference environment (Oldenburg vs. Aachen) exert a large effect on the recorded loudness mismatch, whereas no comparably large effect of the respective room is observed across moderately reverberant rooms at both sites. Hence, the inconclusive findings in the literature appear to be related to the experienced disparity between headphone and loudspeaker presentation, where even small differences in (anechoic) room acoustics significantly change the response behavior of the subjects.
Moreover, individual factors such as loudness summation appear to be only loosely connected to the observed mismatch, i.e., the mismatch cannot be predicted directly from individual binaural loudness summation. These findings, even though not completely explainable by the still limited number of parameter variations performed in this study, have consequences for the comparability of experiments using loudspeakers with conditions employing headphones or other ear-level hearing devices.