Abstract

Factor analysis (principal component analysis followed by varimax rotation) had shown that 3 common factors appear across 20 critical-band power fluctuations derived from spoken sentences of eight different languages [Ueda et al. (2010). Fechner Day 2010, Padua]. The present study investigated the contributions of such power-fluctuation factors to speech intelligibility. The method of factor analysis was modified to obtain factors suitable for resynthesizing speech sounds as 20-critical-band noise-vocoded speech. The resynthesized speech sounds were used for an intelligibility test. The modification of factor analysis ensured that the resynthesized speech sounds were not accompanied by a steady background noise caused by the data reduction procedure. Spoken sentences of British English, Japanese, and Mandarin Chinese were subjected to this modified analysis. Confirming the earlier analysis, 3–4 factors were common to these languages. The number of power-fluctuation factors needed to make noise-vocoded speech intelligible was then examined. Critical-band power fluctuations of the Japanese spoken sentences were resynthesized from the obtained factors, resulting in noise-vocoded-speech stimuli, and the intelligibility of these stimuli was tested with 12 native Japanese speakers. Identification performance for Japanese morae (syllable-like phonological units) was measured with the number of factors ranging from 1 to 9. Statistically significant improvement in intelligibility was observed when the number of factors was increased stepwise up to 6. The 12 listeners identified 92.1% of the morae correctly on average in the 6-factor condition. The intelligibility improved sharply when the number of factors changed from 2 to 3. In this step, the cumulative contribution ratio of the factors increased by only 10.6 percentage points, from 37.3% to 47.9%, but the average mora identification leaped from 6.9% to 69.2%. The results indicated that, if the number of factors is 3 or more, elementary linguistic information is preserved in such noise-vocoded speech.
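As a rough illustration of the analysis named in the abstract, the following sketch runs a textbook principal component analysis followed by varimax rotation and reports the cumulative contribution ratio of the retained factors. It is not the modified procedure used in the study, whose details are not given here. The function names `pca_loadings` and `varimax`, the use of the covariance matrix, and the synthetic 20-band data are all assumptions made for illustration only.

```python
import numpy as np

def pca_loadings(X, n_factors):
    """PCA on a (frames x bands) matrix of band power fluctuations.

    Returns the loadings (bands x n_factors) and the cumulative
    contribution ratio of the retained components.
    Assumption: the covariance matrix is analyzed, not the correlation matrix.
    """
    Xc = X - X.mean(axis=0)
    cov = np.cov(Xc, rowvar=False)
    evals, evecs = np.linalg.eigh(cov)          # ascending eigenvalues
    order = np.argsort(evals)[::-1]             # sort descending
    evals, evecs = evals[order], evecs[:, order]
    loadings = evecs[:, :n_factors] * np.sqrt(evals[:n_factors])
    cum_ratio = evals[:n_factors].sum() / evals.sum()
    return loadings, cum_ratio

def varimax(loadings, max_iter=100, tol=1e-8):
    """Standard SVD-based varimax rotation; returns rotated loadings and R."""
    p, k = loadings.shape
    R = np.eye(k)
    crit_old = 0.0
    for _ in range(max_iter):
        L = loadings @ R
        grad = loadings.T @ (L ** 3 - L @ np.diag((L ** 2).sum(axis=0)) / p)
        u, s, vt = np.linalg.svd(grad)
        R = u @ vt                              # nearest orthogonal matrix
        crit = s.sum()
        if crit_old != 0.0 and crit / crit_old < 1.0 + tol:
            break
        crit_old = crit
    return loadings @ R, R

# Synthetic stand-in for 20 critical-band power fluctuations (not speech data):
# three hidden sources mixed into 20 bands plus a little noise.
rng = np.random.default_rng(0)
sources = rng.standard_normal((2000, 3))
mixing = rng.standard_normal((3, 20))
X = sources @ mixing + 0.1 * rng.standard_normal((2000, 20))

loadings, cum_ratio = pca_loadings(X, n_factors=3)
rotated, R = varimax(loadings)
print(rotated.shape, round(float(cum_ratio), 3))
```

Because the rotation matrix is orthogonal, the rotated factors explain exactly as much total variance as the unrotated components; rotation only redistributes that variance among the factors so that each one loads on fewer bands.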

Highlights

  • It is important to understand what acoustic characteristics of speech sounds are essential for speech intelligibility in order to elucidate the cognitive mechanisms of speech communication

  • The acoustic characteristics of speech that contribute to speech perception have been investigated with many different approaches

  • A plausible explanation for the lower cumulative contributions for Mandarin Chinese is that the mean fundamental frequency of Mandarin Chinese speech was higher than that of the other languages, and that the cepstral liftering could not smooth the power spectra sufficiently

Summary

Introduction

It is important to understand what acoustic characteristics of speech sounds are essential for speech intelligibility in order to elucidate the cognitive mechanisms of speech communication. Perceptual experiments in which spectral information was systematically degraded revealed that perceptual cues embedded in speech spectra are highly redundant (Remez et al., 1981; Baer and Moore, 1993; Shannon et al., 1995; Warren et al., 1995). These studies often proceeded from the concept of auditory filters (Patterson, 1974; Moore, 2012) or critical bands (Fletcher, 1940), which assumes parallel channels that process frequency components. Many studies (e.g., Dorman et al., 1997; Loizou et al., 1999; Souza and Rosen, 2009; Ellermeier et al., 2015) have measured the intelligibility of noise-vocoded speech and reported results consistent with Shannon et al. (1995). These studies suggest that the 20 outputs of critical-band filters, for example, can be reduced to a smaller number of channels without sacrificing speech intelligibility too much.
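To make the idea of noise-vocoded speech concrete, here is a minimal sketch of a noise vocoder: the signal is split into frequency bands, the power envelope of each band is extracted, and each envelope then modulates band-limited noise. This is only an illustration under simplifying assumptions, not the procedure of any study cited above. The brick-wall FFT filter, the 10 ms moving-average smoothing, the band edges, and the function names are all hypothetical choices.

```python
import numpy as np

def band_limit(x, fs, lo, hi):
    """Crude brick-wall band-pass: keep only FFT bins in [lo, hi) Hz."""
    spec = np.fft.rfft(x)
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)
    spec[(freqs < lo) | (freqs >= hi)] = 0.0
    return np.fft.irfft(spec, n=len(x))

def power_envelope(x, fs, win_ms=10.0):
    """Smoothed power: moving average of the squared signal (assumed window)."""
    n = max(1, int(fs * win_ms / 1000.0))
    kernel = np.ones(n) / n
    return np.convolve(x ** 2, kernel, mode="same")

def noise_vocode(x, fs, edges, seed=0):
    """Replace the fine structure in each band with envelope-modulated noise."""
    rng = np.random.default_rng(seed)
    noise = rng.standard_normal(len(x))
    out = np.zeros(len(x))
    for lo, hi in zip(edges[:-1], edges[1:]):
        env = power_envelope(band_limit(x, fs, lo, hi), fs)
        carrier = band_limit(noise, fs, lo, hi)
        carrier /= np.sqrt(np.mean(carrier ** 2)) + 1e-12   # unit-RMS carrier
        out += np.sqrt(env) * carrier
    return out

# Toy input: a 1 kHz tone with a slow amplitude modulation (not real speech).
fs = 16000
t = np.arange(fs) / fs
x = (1.0 + 0.5 * np.sin(2 * np.pi * 4 * t)) * np.sin(2 * np.pi * 1000 * t)

# Placeholder band edges in Hz; a real critical-band vocoder would use 20 bands.
edges = [100, 500, 1500, 4000]
y = noise_vocode(x, fs, edges)
print(y.shape)
```

Reducing the number of bands in `edges` coarsens the spectral detail that survives, which is the manipulation the channel-number studies above vary.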
