Abstract

The relation between the hearing threshold for burst noise and several speech properties was measured using the sequential estimation method. The stimuli were built from noise bursts with durations of 2–100 ms, LPC-based synthetic speech, and natural speech. The synthetic speech samples were generated from LPC parameters of the original speech samples (five Japanese vowels spoken by female speakers) using noise and pulse train excitation. A noise burst was added to each speech sample to produce a stimulus, and the subjects were instructed to detect the noise burst in it. The detectability of a short noise burst (up to 10 ms) was found to increase as the pitch period lengthened and to differ from vowel to vowel. However, the detectability of a relatively long noise burst (up to 100 ms) appeared to be independent of the excitation source parameters. A synchronization effect between short-term speech energy and the minimum audible burst noise level was also observed for very short noise bursts (up to 3 ms). A signal detection model of the human auditory system was proposed, and its predictions correlated well with these results. The results suggest that synthetic speech quality and the performance of waveform coding systems can be improved.
