Abstract

The peripheral auditory system functions like a frequency analyser, often modelled as a bank of non-overlapping band-pass filters called critical bands; 20 bands are necessary for simulating the frequency resolution of the ear within an ordinary frequency range of speech (up to 7,000 Hz). A far smaller number of filters has seemed sufficient, however, to re-synthesise intelligible speech sentences from the power fluctuations of the speech signals passing through them; nevertheless, the number and frequency ranges of the frequency bands needed for efficient speech communication are still unknown. We derived four common frequency bands (covering approximately 50–540, 540–1,700, 1,700–3,300, and above 3,300 Hz) from factor analyses of spectral fluctuations in eight different spoken languages/dialects. The analyses robustly yielded three factors common to all languages/dialects investigated: the low & mid-high factor, related to the two separate frequency ranges of 50–540 and 1,700–3,300 Hz; the mid-low factor, related to the range of 540–1,700 Hz; and the high factor, related to the range above 3,300 Hz. This suggests a language universal.

Highlights

  • Plomp and colleagues[1,2,3,4] found that two acoustic principal components were enough to represent Dutch steady vowels

  • Four blocks of critical bands, i.e., four frequency bands, consistently appeared in both the three-factor (Fig. 1a) and the four-factor (Fig. 1b) results. Because one of the factors obtained in the three-factor analysis was bimodal, both the three- and four-factor analyses yielded four frequency bands

  • The discrepancies between languages/dialects observed in the lowest frequency band in the four-factor analysis are likely to have been caused by the inclusion of samples from speakers with relatively high fundamental frequencies, which could make the frequency components in the spectra too sparse

Summary

Introduction

Plomp and colleagues[1,2,3,4] found that two acoustic principal components were enough to represent Dutch steady vowels. They extracted principal components from level profiles obtained from a bank of bandpass filters with bandwidths similar to those of critical bands, which represent the frequency-analysis properties of the auditory periphery, i.e., the basilar membrane[3,5,6,7,8,9]. The principal-component-analysis (PCA) technique, as pioneered by Plomp and his colleagues and further pursued by Zahorian and Rothenberg[10], was extended in two respects. First, it was applied to a database[11] of complete spoken sentences (58–200, depending on the language) rather than steady vowels. Second, the sentences were spoken in eight different languages/dialects, i.e., American English, British English, Cantonese, French, German, Japanese, Mandarin, and Spanish, by 10–20 speakers in each language (Table 1; Supplementary Fig. S1 shows a block diagram of the analyses).
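To make the idea of extracting principal components from filter-bank level profiles more concrete, the following is a minimal sketch and not the authors' actual pipeline. It assumes a matrix of band levels (time frames by 20 bands) is already available. Here that matrix is replaced by hypothetical synthetic data in which three latent sources drive groups of bands, one of them bimodal. The function names `pca_band_levels` and `synthetic_levels`, the band grouping, and the noise level are all illustrative assumptions.

```python
import numpy as np


def pca_band_levels(levels, n_components):
    """PCA of a (frames x bands) level matrix via its correlation matrix.

    Returns the loadings (bands x n_components) and the proportion of
    total variance explained by each retained component.
    """
    # Standardise each band so that the analysis uses correlations.
    z = (levels - levels.mean(axis=0)) / levels.std(axis=0)
    corr = (z.T @ z) / z.shape[0]

    # eigh returns eigenvalues in ascending order; sort them descending.
    eigvals, eigvecs = np.linalg.eigh(corr)
    order = np.argsort(eigvals)[::-1]
    eigvals = eigvals[order]
    eigvecs = eigvecs[:, order]

    # Scale eigenvectors so that loadings are band-component correlations.
    loadings = eigvecs[:, :n_components] * np.sqrt(eigvals[:n_components])
    explained = eigvals[:n_components] / eigvals.sum()
    return loadings, explained


def synthetic_levels(n_frames=5000, seed=0):
    """Hypothetical toy data: 20 bands driven by three latent sources.

    Source 0 drives two separate band groups (a bimodal pattern),
    source 1 a middle group, and source 2 the highest bands.
    This grouping is an assumption made for illustration only.
    """
    rng = np.random.default_rng(seed)
    groups = np.array([0] * 5 + [1] * 7 + [0] * 4 + [2] * 4)
    latent = rng.standard_normal((n_frames, 3))
    noise = 0.3 * rng.standard_normal((n_frames, groups.size))
    return latent[:, groups] + noise, groups


if __name__ == "__main__":
    levels, groups = synthetic_levels()
    loadings, explained = pca_band_levels(levels, n_components=3)
    print("Variance explained by each component:", np.round(explained, 3))
    print("Loadings on component 1:", np.round(loadings[:, 0], 2))
```

In this toy setting the first component loads on two separate groups of bands, which illustrates how a single bimodal component can correspond to two frequency bands. Real speech data would first require a critical-band filter bank and power extraction, which are not shown here.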
