Abstract

Problem statement: Fundamental frequency (F0) is an important speech feature that characterizes the prosody of human speech. It results from the vibration of the vocal cords during speech production. In Thailand, four main dialects are spoken by people living in four core regions: the central, north, northeast and south regions. Environmental noise also plays an important role in degrading speech quality. Studying the effects of noise on the F0 contour of Thai dialects reveals the importance of noise reduction. Approach: Four types of environmental noise were recorded at different power levels and subsequently mixed with clean speech. The F0 contours were extracted for the different dialects, noise types and noise levels. The difference between the F0 contour of clean speech and that of noise-corrupted speech was calculated in terms of Root Mean Square Error (RMSE). Results: In the experiments, each regional dialect includes 10 samples of 10 utterances with male and female speech. The four types of noise are train, factory, car and air conditioner. Moreover, five levels of each type of noise are used, ranging from 0 to 20 dB. The results show that the effects of distinct types of noise differ. The four regional dialects also lead to differences in RMSE. Conclusion: The recorded noises deteriorate the F0 contours for all Thai dialects.

Highlights

  • In a recent study on modeling the F0 contour in noisy environments, simulated noise deteriorated the Fujisaki model parameters (Fujisaki and Sudo, 1971; Mixdorff and Fujisaki, 1997; Seresangtakul and Takara, 2003)

  • The sentences have been recorded in four Thai dialects: standard Thai (Center-dialect), Lanna Thai dialect (North-dialect), Lao-style Thai dialect (Northeast-dialect) and South Thai dialect (South-dialect)

  • This study examines the effects of noise on the F0 contour for Thai dialects

Summary

INTRODUCTION

The speech database of four Thai dialects is constructed. In a recent study on modeling the F0 contour in noisy environments, simulated noise deteriorated the Fujisaki model parameters (Fujisaki and Sudo, 1971; Mixdorff and Fujisaki, 1997; Seresangtakul and Takara, 2003). This study proposes an analysis of the differences between the F0 contour of clean speech and that of noise-corrupted speech in terms of RMSE. The noise database is constructed with the four selected types of noise: air-conditioner, car, factory and train noise. The F0 contours of clean speech are extracted from the speech database in the “calculation of F0 contour” stage. The dialects include the standard Thai or Central dialect, Lanna or North dialect, Lao-. Environmental noises: The four types of noise are train, factory, car and air conditioner. They are mixed directly with the pre-recorded clean speech in the speech database. As for the level variation, each type of noise is set to levels of 0, 5, 10, 15 and 20 dB.
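The mixing and comparison steps described above can be sketched as follows. This is a minimal illustration, not the authors' implementation. It makes three assumptions the summary does not state: the dB levels are treated as signal-to-noise ratios, the noise is looped or trimmed to the length of the utterance, and unvoiced frames are marked with an F0 of 0 and excluded from the error. The function names `mix_at_snr` and `f0_rmse` are hypothetical, and F0 extraction itself is left to whatever pitch tracker is in use.

```python
import math


def mean_power(signal):
    """Average power of a sequence of samples."""
    return sum(x * x for x in signal) / len(signal)


def mix_at_snr(clean, noise, snr_db):
    """Add noise to clean speech at a target SNR in dB (assumed meaning of 'level')."""
    # Loop or trim the noise so that it covers the whole utterance.
    tiled = [noise[i % len(noise)] for i in range(len(clean))]
    # Gain that makes clean_power / (gain^2 * noise_power) equal 10^(snr_db / 10).
    gain = math.sqrt(mean_power(clean) / (mean_power(tiled) * 10 ** (snr_db / 10)))
    return [c + gain * n for c, n in zip(clean, tiled)]


def f0_rmse(f0_clean, f0_noisy):
    """RMSE in Hz between two F0 contours, over frames voiced in both.

    Assumption: a value of 0 marks an unvoiced frame.
    """
    pairs = [(a, b) for a, b in zip(f0_clean, f0_noisy) if a > 0 and b > 0]
    if not pairs:
        return float("nan")
    return math.sqrt(sum((a - b) ** 2 for a, b in pairs) / len(pairs))


if __name__ == "__main__":
    clean = [1.0, -1.0, 1.0, -1.0]   # toy stand-in for a clean utterance
    noise = [0.5, -0.5]              # toy stand-in for a recorded noise
    for level in (0, 5, 10, 15, 20):  # the five levels named in the text
        noisy = mix_at_snr(clean, noise, level)
        print(level, [round(x, 3) for x in noisy])
    # Toy F0 contours in Hz; the second frame is unvoiced in the clean contour.
    print(f0_rmse([100.0, 0.0, 120.0], [103.0, 110.0, 124.0]))
```

In a full experiment, each noisy signal would be passed to the same F0 extractor as the clean signal, and `f0_rmse` would then be computed per dialect, noise type and noise level. Restricting the error to frames voiced in both contours is one possible design choice; counting voicing errors separately is another.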

RESULTS
DISCUSSION
CONCLUSION