Native speech perception is generally assumed to be highly efficient and accurate. Very little research has, however, directly examined the limitations of native perception, especially for contrasts that are only minimally differentiated acoustically and articulatorily. Here, we demonstrate that native speech perception may indeed be more difficult than is often assumed when phonemes are highly similar, and we address the nature and extremes of consonant perception. We present two studies of native and non-native (English) perception of the acoustically and articulatorily similar four-way coronal stop contrast /t ʈ t̪ ȶ/ (apico-alveolar, apico-retroflex, lamino-dental, lamino-alveopalatal) of Wubuy, an indigenous language of Australia. The results show that all listeners find contrasts involving /ȶ/ easy to discriminate, but that, for both groups, contrasts involving /t ʈ t̪/ are much harder. Where the two groups differ, the results largely reflect native language (Wubuy vs. English) attunement, as predicted by the Perceptual Assimilation Model [1, 2, 3]. We also observe striking perceptual asymmetries in the native listeners' perception of contrasts involving the latter three stops, likely due to differences in input frequency. Such asymmetries have not previously been observed in adults, and we propose a novel Natural Referent Consonant Hypothesis to account for the results.