Abstract

In recent years, real-life applications such as talking books, gadgets and humanoid robots have drawn attention to research in expressive speech synthesis. Speech synthesis is already widely used, but there is a growing need for expressive speech synthesis, especially in communication and robotics. In this paper, global and local rules are developed to convert neutral speech to storytelling-style speech for the Malay language. To generate the rules, modifications of prosodic parameters such as pitch, intensity, duration, tempo and pauses are considered. These modifications are examined by performing prosodic analysis on a story collected from an experienced female storyteller and an experienced male storyteller. The global and local rules are applied at the sentence level, and the speech is synthesized using HNM. Subjective tests are conducted to evaluate the quality of the synthesized storytelling speech for both sets of rules in terms of naturalness, intelligibility, and similarity to the original storytelling speech. The results show that the global rules give better results than the local rules.

Highlights

  • Speech synthesis is the process of converting written text to spoken audio, known as text-to-speech (TTS)

  • Pitch, duration, intensity, tempo and pause are analyzed for modification

  • The modification factors are derived by analyzing the difference between prosodic parameters of neutral and storyteller speech sentences
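As a rough illustration of the last highlight, the sketch below derives one modification factor per prosodic parameter by comparing neutral and storyteller measurements of the same sentences. It is a minimal sketch under stated assumptions, not the paper's procedure: the factor is assumed to be the ratio of the storyteller mean to the neutral mean, and the function name `modification_factors`, the parameter keys, and all numeric values are hypothetical.

```python
from statistics import mean


def modification_factors(neutral, storyteller):
    """Return one scaling factor per prosodic parameter.

    Assumption: the factor is the storyteller mean divided by the neutral
    mean. The source only says factors come from the difference between
    the two speaking styles; the ratio form is chosen here for illustration.
    """
    return {
        name: mean(storyteller[name]) / mean(neutral[name])
        for name in neutral
    }


# Hypothetical per-sentence measurements (not data from the paper).
neutral = {
    "pitch_hz": [200.0, 210.0, 190.0],
    "duration_s": [2.0, 2.5, 1.5],
    "intensity_db": [60.0, 62.0, 58.0],
}
storyteller = {
    "pitch_hz": [240.0, 250.0, 230.0],
    "duration_s": [2.4, 3.0, 1.8],
    "intensity_db": [63.0, 65.0, 61.0],
}

factors = modification_factors(neutral, storyteller)
for name, factor in factors.items():
    print(f"{name}: x{factor:.2f}")
```

A factor above 1 means the storyteller version is higher, longer, or louder than the neutral one for that parameter; applying the factor to neutral speech would move it toward the storytelling style.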


Summary

Introduction

Speech synthesis is the process of converting written text to spoken audio, known as text-to-speech (TTS). Rule-based prosody modification has been a popular approach for incorporating emotion into the storytelling speaking style. This approach has been adopted by many expressive TTS systems for Dutch [6], English [7], Catalan [8], Spanish [9], Indian [10], German [11] and Korean [12]. Local and global rules of prosody modification are further investigated here to determine how well they perform in synthesizing storytelling speech in the Malay language.

Speech data collection
Prosodic parameter analysis
Development of prosodic rules for storytelling speaking style
Duration
Intensity
Evaluations and discussions
Findings
Conclusions
