UC Berkeley Phonology Lab Annual Report (2008)

Languages’ sound inventories: the devil in the details

John J. Ohala

1. Introduction

In this paper I am going to modify somewhat a statement made in Ohala (1980) regarding languages’ speech sound inventories exhibiting the ‘maximum use of a set of distinctive features’. In that paper, after noting that vowel systems seem to conform to the principle of maximal acoustic-perceptual differentiation (as proposed earlier by Bjorn Lindblom), I observe:

... it would be most satisfying if we could apply the same principles to predict the arrangement of consonants, i.e., posit an acoustic-auditory space and show how the consonants position themselves so as to maximize the inter-consonantal distance. Were we to attempt this, we should undoubtedly reach the patently false prediction that a 7 consonant system should include something like the following: , k’, ts, , m, r, . Languages which do have few consonants, such as the Polynesian languages, do not have such an exotic inventory. In fact, the languages which do possess the above set (or close to it) such as Zulu, also have a great many other consonants of each type, i.e., ejectives, clicks, affricates, etc. Rather than maximum differentiation of the entities in the consonant space, we seem to find something approximating the principle which would be characterized as “maximum utilization of the available distinctive features”. This has the result that many of the consonants are, in fact, perceptually quite close – differing by a minimum, not a maximum number of distinctive features.ⁱ

Looking at moderately large to quite large segment inventories like those in English, French, Hindi, Zulu, Thai, this is exactly the case. Many segments are phonetically similar and as a consequence are confusable. Some data showing relatively high rates of confusion of certain CV syllables (presented in isolation, hi-fi listening condition) (from Winitz et al. 1972) are given in Table 1.
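The contrast drawn in the quoted passage, between maximizing inter-consonantal distance and maximizing the use of available features, can be made concrete with a toy sketch. Everything in it is an illustrative assumption rather than part of the original argument: segments are abstract binary feature vectors, perceptual distance is approximated by the number of differing features (Hamming distance), and the greedy search merely stands in for a dispersion model. The function names are hypothetical.

```python
from itertools import combinations, product


def hamming(a, b):
    """Number of features on which two segments differ."""
    return sum(x != y for x, y in zip(a, b))


def min_distance(inventory):
    """Smallest pairwise feature distance within an inventory."""
    return min(hamming(a, b) for a, b in combinations(inventory, 2))


def feature_economical(n_features):
    """'Maximum utilization': every combination of n binary features."""
    return list(product((0, 1), repeat=n_features))


def greedy_dispersed(k, n_features):
    """Toy 'maximal dispersion': greedily add the vector whose minimum
    distance to the segments already chosen is largest."""
    space = list(product((0, 1), repeat=n_features))
    chosen = [space[0]]
    while len(chosen) < k:
        best = max(
            (v for v in space if v not in chosen),
            key=lambda v: min(hamming(v, c) for c in chosen),
        )
        chosen.append(best)
    return chosen


# Four segments built from only two features (hypothetical example).
economical = feature_economical(2)
# Four segments spread over a six-feature space (hypothetical example).
dispersed = greedy_dispersed(4, 6)

print(min_distance(economical))  # 1: neighbours differ by a single feature
print(min_distance(dispersed))   # larger: segments are kept far apart
```

Under these assumptions the economical inventory packs its segments one feature apart, which is the situation the quoted passage describes for real consonant systems, while the dispersed inventory buys greater separation at the cost of drawing on many more features.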
Table 1. Confusion matrix from Winitz et al. (1972). Spoken syllables consisted of stop burst plus 100 msec of following transition and vowel; high-fidelity listening conditions. Numbers given are the incidence of the specified response to the specified stimulus.

Stimulus: /pi/ /pa/ /pu/ /ti/ /ta/ /tu/ /ki/ /ka/ /ku/
Response: /p/ /t/ /k/
[Table cell values not recoverable from the source text.]

I do actually believe that the degree of auditory distinctness plays some role in shaping languages’ segment inventories, especially when auditory distinctness is low. Sound change, acting blindly (i.e., non-teleologically), weeds out similar sounding elements through confusion which results in mergers