Abstract

The speech intelligibility index (SII) is a widely used objective method of predicting speech intelligibility, in which the frequency importance function (FIF) is a key component. The FIF characterizes the relative contribution of different frequency bands to speech recognition. In this work, FIFs for Mandarin Chinese were derived for monosyllabic words spoken by male and female speakers. These words were phoneme-balanced and selected from the word lists of a national standard that has been used for measuring the articulation index in China since 1995. A pilot experiment was conducted to determine suitable signal-to-noise ratios (SNRs) for measuring speech intelligibility. The main experiment was then conducted to derive the FIFs using 288 test conditions (4 SNRs × 36 filtering conditions × 2 speaker genders). The noise was speech-spectrum-shaped and was generated separately for the male and female speech materials. The results show that, using 1/3-octave analysis bands: (1) the FIF averaged across genders has a peak in the frequency range between 1000 and 2500 Hz, which is consistent with the FIF for English monosyllabic words; (2) the frequency bands centered at 160, 1600, and 2000 Hz are slightly more important for Mandarin Chinese than for English; (3) male speech is more intelligible than female speech, and the band centered at 160 Hz is more important for female than for male speech. The FIF differences between Mandarin and English and the effect of speaker gender are analyzed and discussed.
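
To illustrate the role the FIF plays in the SII, the following is a minimal sketch, assuming the conventional formulation in which the index is the importance-weighted sum of per-band audibility. The function name `sii`, the three-band layout, and all numeric values are hypothetical and are not taken from the paper; the actual derivation uses 1/3-octave bands and measured intelligibility scores. The last lines simply enumerate the factorial design stated above to confirm the count of 288 conditions.

```python
from itertools import product


def sii(band_importance, band_audibility):
    """Importance-weighted sum of band audibility (one value per band)."""
    if len(band_importance) != len(band_audibility):
        raise ValueError("need one audibility value per band")
    total = sum(band_importance)
    # Normalize so the importance weights sum to 1, as an FIF does.
    weights = [w / total for w in band_importance]
    # Clip each audibility value to [0, 1] before weighting.
    return sum(w * min(max(a, 0.0), 1.0)
               for w, a in zip(weights, band_audibility))


# Hypothetical three-band example, for illustration only (not data from the paper).
importance = [1.0, 2.0, 1.0]
audibility = [0.5, 1.0, 0.0]
index = sii(importance, audibility)

# Factorial design from the abstract: 4 SNRs x 36 filtering conditions x 2 genders.
conditions = list(product(range(4), range(36), ("male", "female")))
```

In this sketch a band with a larger importance weight moves the index more for the same change in audibility, which is the sense in which some bands are "more important" than others.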
