Abstract
Problem statement: In speech communication, noise from the surrounding environment degrades communication quality in various ways. The quality of the received speech should be analyzed to see how the most significant noises reduce it, so that they can be removed appropriately. Approach: This study analyzes the effects of noise on Thai speech. Four kinds of noise (air conditioner, car, factory and train) are simulated at various signal-to-noise ratios. The root mean square error between the fundamental frequency contours of the corrupted speech and the clean speech is calculated. Finally, the root mean square error is compared across genders, across the four kinds of noise and across the signal-to-noise ratio levels. Results: In the experiments, 400 speech utterances from male and female speakers are used as speech materials, and the average root mean square error is calculated. The results show that the fundamental frequency contour of female speech is affected more than that of male speech. Among the four kinds of noise, car noise has the greatest influence, while factory noise has the least. Moreover, the root mean square error is inversely proportional to the signal-to-noise ratio. Conclusion: The findings show that noise from the surrounding environment affects speech quality as measured by the fundamental frequency contour. This study provides preliminary knowledge for enhancing speech quality in further work such as speech synthesis systems or other speech processing technologies.
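The approach above (mixing noise into clean speech at a chosen signal-to-noise ratio, then scoring the F0 contour by root mean square error) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names `power`, `mix_at_snr` and `f0_rmse` are hypothetical, the SNR is assumed to be defined over the whole utterance, and the RMSE is assumed to be taken only over frames that are voiced (F0 > 0) in both contours, a choice the abstract does not specify.

```python
import math

def power(x):
    """Mean squared amplitude of a sequence of samples."""
    return sum(v * v for v in x) / len(x)

def mix_at_snr(speech, noise, snr_db):
    """Scale `noise` so that 10*log10(P_speech / P_noise) == snr_db, then add it."""
    if len(noise) < len(speech):
        raise ValueError("noise must be at least as long as speech")
    noise = noise[:len(speech)]
    p_noise = power(noise)
    if p_noise == 0:
        raise ValueError("noise is silent")
    # Noise power required to reach the requested SNR.
    target_noise_power = power(speech) / (10 ** (snr_db / 10))
    gain = math.sqrt(target_noise_power / p_noise)
    return [s + gain * n for s, n in zip(speech, noise)]

def f0_rmse(f0_clean, f0_noisy):
    """RMSE in Hz over frames voiced in both contours (assumption: 0 marks unvoiced)."""
    pairs = [(a, b) for a, b in zip(f0_clean, f0_noisy) if a > 0 and b > 0]
    if not pairs:
        raise ValueError("no frames voiced in both contours")
    return math.sqrt(sum((a - b) ** 2 for a, b in pairs) / len(pairs))
```

In an experiment of this kind, one would call `mix_at_snr` for each noise type and SNR level, extract the F0 contour of both the clean and the corrupted signal with the same pitch tracker, and average `f0_rmse` over the utterances of each group.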
Highlights
Contour of the speech by varying the level of signal-to-noise ratio
Speech analysis has been conducted for many languages
Fundamental Frequency Contour (F0 contour): There is a substantial amount of data on the frequency of the voice fundamental or fundamental frequency (F0) in the speech of speakers who differ in age and sex
Summary
Contour of the speech by varying the level of signal-to-noise ratio. The findings are expected to be applied in further study. The fundamental frequency, extracted frame by frame from the speech, is an important feature indicating the pitch or voicing level of the speech. It has been widely exploited in most of the speech processing technologies mentioned above. The effect of surrounding noises on the fundamental frequency should therefore be studied carefully. In women, F0 is fairly stable until menopause, after which it decreases, reaching a minimum about 15 Hz lower at around 70 years of age (Pegoraro-Krook, 1988). The dramatic decrease in F0 during puberty has been observed to continue, with subsequent deceleration, until about 35 years of age. Thereafter, at about 55 years of age, F0 begins to rise again (Pegoraro-Krook, 1988).
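To make the frame-by-frame extraction of the fundamental frequency concrete, here is a rough sketch using normalized autocorrelation peak picking. The source does not say which pitch extraction method was used, so this is a swapped-in textbook technique; the names `frame_f0` and `f0_contour`, the 60 to 400 Hz search range, the 40 ms frame, the 10 ms hop and the 0.3 voicing threshold are all illustrative assumptions.

```python
import math

def frame_f0(frame, sample_rate, fmin=60.0, fmax=400.0, voicing_threshold=0.3):
    """Estimate F0 (Hz) of one frame by autocorrelation; 0.0 means unvoiced."""
    mean = sum(frame) / len(frame)
    x = [v - mean for v in frame]  # remove DC offset
    lag_min = max(1, int(sample_rate / fmax))
    lag_max = min(int(sample_rate / fmin), len(x) - 1)
    best_lag, best_r = 0, 0.0
    for lag in range(lag_min, lag_max + 1):
        a, b = x[:len(x) - lag], x[lag:]
        denom = math.sqrt(sum(v * v for v in a) * sum(v * v for v in b))
        if denom == 0:
            continue  # silent segment, no correlation defined
        r = sum(p * q for p, q in zip(a, b)) / denom  # normalized to [-1, 1]
        if r > best_r:
            best_lag, best_r = lag, r
    if best_lag == 0 or best_r < voicing_threshold:
        return 0.0
    return sample_rate / best_lag

def f0_contour(signal, sample_rate, frame_len=0.04, hop=0.01):
    """Slide a window over the signal and return one F0 value per frame."""
    n, h = int(frame_len * sample_rate), int(hop * sample_rate)
    return [frame_f0(signal[s:s + n], sample_rate)
            for s in range(0, len(signal) - n + 1, h)]
```

Applying `f0_contour` to a clean utterance and to its noise-corrupted version yields two contours of equal length that can then be compared frame by frame. Production pitch trackers add smoothing and octave-error correction that this sketch omits.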