Fujisaki Model Research Articles

Problem statement: Thai has a number of rural dialects, but four are mainly spoken, corresponding to the four core regions: central, north, northeast and south. Recognizing and synthesizing Thai speech across these dialects is consequently difficult. Approach: Prosody is an important factor that must be taken into account, since it affects not only the naturalness but also the intelligibility of speech. To address the problem, speech prosody is preserved by modeling the fundamental frequency (F0) contours. This study proposes an analysis of model parameters for Thai speech prosody across four regional dialects and two genders, as preliminary work for speech recognition and synthesis, and summarizes the differences among the model parameters of the four dialects. Fujisaki's model, a powerful tool for modeling the F0 contour, has been adopted. Seven parameters are derived from Fujisaki's model. The first parameter is the baseline frequency, which is the lowest level of the F0 contour. The second and third parameters are the numbers of phrase commands and tone commands, which reflect how often the utterance surges at the global and local levels, respectively. The fourth and fifth parameters are the phrase command and tone command durations, which reflect the speaking rate and the length of a syllable, respectively. The sixth and seventh parameters are the amplitudes of the phrase command and tone command, which reflect the energy of the global speech and the energy of the local syllable, respectively. Results: In the experiments, each regional dialect includes 200 samples of one sentence for each of male and female speech, so the speech database contains 1600 utterances in total (4 dialects × 2 genders × 200 samples). The results showed that most of the proposed parameters can clearly distinguish the four regional dialects.
Conclusion: Using Fujisaki's model, the results confirm that the proposed parameters can distinguish the regional dialects efficiently. In future research, they are expected to be applied to speech recognition and synthesis with various regional dialect characteristics.
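
As a rough illustration of how the baseline frequency, phrase commands and tone commands combine into an F0 contour, the following Python sketch uses the standard Fujisaki formulation: the logarithm of F0 is the log baseline plus phrase components (impulse responses) plus tone components (differences of step responses). It is not the authors' implementation. The function names and the default values of alpha, beta and gamma are illustrative assumptions and are not taken from the abstract.

```python
import math


def phrase_response(t, alpha=3.0):
    """Phrase control impulse response Gp(t); zero before the command."""
    if t < 0:
        return 0.0
    return alpha * alpha * t * math.exp(-alpha * t)


def tone_response(t, beta=20.0, gamma=0.9):
    """Tone control step response Ga(t), capped at gamma; zero before the command."""
    if t < 0:
        return 0.0
    return min(1.0 - (1.0 + beta * t) * math.exp(-beta * t), gamma)


def f0_contour(times, fb, phrase_cmds, tone_cmds,
               alpha=3.0, beta=20.0, gamma=0.9):
    """Return F0 in Hz at each time in `times` (seconds).

    fb          -- baseline frequency in Hz
    phrase_cmds -- list of (onset, amplitude) pairs
    tone_cmds   -- list of (onset, offset, amplitude) triples;
                   a negative amplitude lowers F0 locally
    alpha, beta, gamma -- assumed illustrative constants
    """
    contour = []
    for t in times:
        log_f0 = math.log(fb)
        for onset, amp in phrase_cmds:
            log_f0 += amp * phrase_response(t - onset, alpha)
        for onset, offset, amp in tone_cmds:
            log_f0 += amp * (tone_response(t - onset, beta, gamma)
                             - tone_response(t - offset, beta, gamma))
        contour.append(math.exp(log_f0))
    return contour


if __name__ == "__main__":
    # Hypothetical utterance: one phrase command and two tone commands.
    times = [i * 0.01 for i in range(150)]
    f0 = f0_contour(times, fb=110.0,
                    phrase_cmds=[(0.0, 0.4)],
                    tone_cmds=[(0.2, 0.45, 0.3), (0.7, 0.95, -0.2)])
    print(min(f0), max(f0))
```

With no commands the contour stays at the baseline frequency; each phrase command adds a slowly decaying global rise, and each tone command adds a local rise or fall between its onset and offset.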

Problem statement: In spontaneous speech communication, prosody is an important factor that must be taken into account, since it affects not only the naturalness but also the intelligibility of speech. A number of systems for synthesizing Thai expressive speech have been developed over the years. However, expressive speech with various speaking styles has not yet been achieved. To generate expressive speech, the fundamental frequency (F0) contours must be modeled accurately to preserve the speech prosody. Approach: This study therefore proposes an analysis of model parameters for Thai speech prosody across three speaking styles and two genders, as preliminary work for speech synthesis. Fujisaki's model, a powerful tool for modeling the F0 contour, has been adopted, and the speaking styles of happiness, sadness and reading have been considered. Seven parameters are derived from Fujisaki's model. The first parameter is the baseline frequency, which is the lowest level of the F0 contour. The second and third parameters are the numbers of phrase commands and tone commands, which reflect how often the utterance surges at the global and local levels, respectively. The fourth and fifth parameters are the phrase command and tone command durations, which reflect the speaking rate and the length of a syllable, respectively. The sixth and seventh parameters are the amplitudes of the phrase command and tone command, which reflect the energy of the global speech and the energy of the local syllable, respectively. Results: In the experiments, each speaking style includes 200 samples of one sentence for each of male and female speech, so the speech database contains 1200 utterances in total (3 styles × 2 genders × 200 samples). The results show that most of the proposed parameters can clearly distinguish the three speaking styles.
Conclusion: The findings provide strong evidence for applying the successful parameters in speech synthesis systems and other speech processing technologies.
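
The seven parameters listed in the abstract can be gathered from one utterance's command set roughly as in the Python sketch below. The abstract does not define how durations or amplitudes are measured, so several choices here are assumptions made only for illustration: a phrase command's duration is taken as the time from its onset to the next phrase onset (or to the utterance end), a tone command's duration as offset minus onset, and amplitudes as mean absolute values. The function and field names are hypothetical.

```python
from statistics import mean


def _avg(values):
    """Mean of a list, or 0.0 when the list is empty."""
    return mean(values) if values else 0.0


def summarize_parameters(fb, phrase_cmds, tone_cmds, utterance_end):
    """Collect the seven summary parameters for one utterance.

    fb            -- baseline frequency in Hz
    phrase_cmds   -- list of (onset, amplitude) pairs
    tone_cmds     -- list of (onset, offset, amplitude) triples
    utterance_end -- end time of the utterance in seconds
    """
    onsets = sorted(onset for onset, _ in phrase_cmds)
    # Assumption: a phrase command lasts until the next phrase onset,
    # and the last one lasts until the utterance ends.
    ends = onsets[1:] + [utterance_end] if onsets else []
    phrase_durations = [end - start for start, end in zip(onsets, ends)]
    # Assumption: a tone command lasts from its onset to its offset.
    tone_durations = [offset - onset for onset, offset, _ in tone_cmds]
    return {
        "baseline_frequency": fb,
        "num_phrase_commands": len(phrase_cmds),
        "num_tone_commands": len(tone_cmds),
        "phrase_command_duration": _avg(phrase_durations),
        "tone_command_duration": _avg(tone_durations),
        # Assumption: amplitudes are summarized by mean magnitude.
        "phrase_command_amplitude": _avg([abs(a) for _, a in phrase_cmds]),
        "tone_command_amplitude": _avg([abs(a) for _, _, a in tone_cmds]),
    }


if __name__ == "__main__":
    # Hypothetical command set for a two-second utterance.
    params = summarize_parameters(
        fb=110.0,
        phrase_cmds=[(0.0, 0.4), (1.0, 0.2)],
        tone_cmds=[(0.1, 0.3, 0.5), (0.5, 0.8, -0.3)],
        utterance_end=2.0,
    )
    for name, value in params.items():
        print(name, value)
```

Computing such a summary per utterance, then comparing the values across speaking styles and genders, is one way the kind of comparison described in the abstract could be set up.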

Articles published on Fujisaki Model

Speech analysis-synthesis system using genetic algorithm and Fujisaki model and its application to coarticulation

Voice conversion for emotional speech: Rule-based synthesis with degree of emotion controllable in dimensional space

Generative model of spectra for a word using Fujisaki model and genetic algorithm

Acoustic correlates of focus in Marathi: Production and perception

Improving Mandarin Prosody Generation Using Alternative Smoothing Techniques

Emotional voice conversion system for multiple languages based on three-layered model in dimensional space

Generative Modeling of Voice Fundamental Frequency Contours

Hierarchical stress modeling and generation in mandarin for expressive Text-to-Speech

Speaker-individuality in Fujisaki model f0 features: implications for forensic voice comparison

F0 contour generation and synthesis using Bengali HMM-based speech synthesis system

The influence of speech rate on Fujisaki model parameters

Data-driven intonational phonology

Design and Development of a Prosody Generator for Arabic TTS Systems

Modeling of Fundamental Frequency Contour of Thai Expressive Speech using Fujisaki's Model and Structural Model

Fujisaki's Model of Fundamental Frequency Contours for Thai Dialects

Analytical Study on Fundamental Frequency Contours of Thai Expressive Speech Using Fujisaki's Model

Close coupling or points of rendezvous? Connections between intonational events and the segmental grid

EVALUATING INTONATIONAL FEATURES FOR EMOTION RECOGNITION FROM SPEECH

Analysing fundamental frequency contours and local speech rate in map task dialogs
