Abstract
It is generally recognized that consonants are more critical than vowels to speech intelligibility, but we suggest that important information is contained in transient speech components, rather than in the quasi-steady-state components of both consonants and vowels. Fixed-frequency filters cannot uniquely separate transients from the more steady-state vowel formants and consonant hubs, even though vowel formants are predominantly low frequency and consonant hubs predominantly high frequency. To study the relative speech intelligibility of the transient versus steady-state components, we employed an algorithm based on time-frequency analysis to extract quasi-steady-state energy from the speech signal, leaving a residual signal of predominantly transient components. Psychometric functions were measured for speech recognition of processed and unprocessed monosyllabic words. The transient components were found to account for approximately 2% of the energy of the original speech, yet were nearly as intelligible as the original speech. As hypothesized, the quasi-steady-state components contained much greater energy while providing significantly less intelligibility.