Tree-based Phone Duration Modelling of the Serbian Language

S Sovilj-Nikic,M Markovic,V Delic,I Sovilj-Nikic

doi:10.5755/j01.eee.20.3.4090

Abstract

Given the perceptual importance of segmental duration, automatic prediction of natural phone durations is essential for achieving natural-sounding synthesized speech. This paper presents a phone duration prediction model for the Serbian language based on a tree-based machine learning approach. A large speech corpus and a feature set of 21 parameters describing phones and their contexts were used for segmental duration prediction. Phone duration modelling is based on attributes such as the current segment identity, preceding and following segment types, manner of articulation (for consonants) and voicing of neighbouring phones, lexical stress, part-of-speech, word length, the position of the segment in the syllable, the position of the syllable in the word, the position of the word in the phrase, phrase break level, etc. These features were extracted from the large speech database for the Serbian language. The results obtained for the full phoneme set using a regression tree, namely a root-mean-squared error (RMSE) of 14.8914 ms, a mean absolute error (MAE) of 11.1947 ms and a correlation coefficient of 0.8796, are comparable with those reported in the literature for Czech, Greek, Lithuanian, Korean, the Indian languages Hindi and Telugu, and Turkish.

Highlights

In natural speech, the duration of speech segments depends on the context in a complex way that involves many factors [1].
These algorithms were used to build binary decision trees on a large speech corpus containing 98214 phonemes, including 38543 vowels and 59671 consonants.
The results achieved using a regression tree for the full phoneme set in the Serbian language (RMSE 14.8914 ms, mean absolute error (MAE) 11.1947 ms and correlation coefficient (CC) 0.8796) are comparable with, or even better than, the results reported in the literature for other languages.
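
The three evaluation measures quoted above (RMSE, MAE and correlation coefficient) can be computed as in the following minimal sketch. It uses the standard textbook definitions, with the Pearson correlation assumed for CC. The function names and the toy durations are invented for illustration and are not taken from the paper.

```python
import math


def rmse(actual, predicted):
    """Root-mean-squared error between two equal-length sequences."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def mae(actual, predicted):
    """Mean absolute error between two equal-length sequences."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def correlation(actual, predicted):
    """Pearson correlation coefficient (assumed definition of CC)."""
    n = len(actual)
    mean_a = sum(actual) / n
    mean_p = sum(predicted) / n
    cov = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    var_a = sum((a - mean_a) ** 2 for a in actual)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_a * var_p)


# Toy phone durations in milliseconds (illustrative values only).
actual_ms = [60.0, 80.0, 100.0, 120.0]
predicted_ms = [70.0, 75.0, 95.0, 130.0]

print("RMSE:", rmse(actual_ms, predicted_ms))
print("MAE:", mae(actual_ms, predicted_ms))
print("CC:", correlation(actual_ms, predicted_ms))
```

RMSE penalises large individual errors more heavily than MAE, which is why the two are usually reported together.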

Summary

INTRODUCTION

In natural speech, the duration of speech segments depends on the context in a complex way that involves many factors [1]. Approaches to duration modelling include linear statistical models, models obtained using a neural network and models based on decision trees. The first decision-tree model for predicting the duration of speech segments in American English was developed by Riley [6] using the CART (Classification and Regression Trees) technique. One of the main advantages of the CART method is its ability to uncover structural relationships between the predicted and actual values [7]. This is why the CART method is commonly used in the initial stages of phone duration modelling research.
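
As a rough illustration of how a CART-style regression tree grows, the sketch below performs a single split step: it tries every yes/no question of the form "feature == value" and keeps the one that minimises the summed squared error of the two child nodes. This is a simplified sketch and not the authors' implementation. The feature names (`segment_type`, `stressed`) and the durations are hypothetical.

```python
def sse(values):
    """Sum of squared deviations from the mean (0.0 for an empty list)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def best_split(rows, durations):
    """Return (feature, value, score) for the question 'row[feature] == value'
    that minimises the summed SSE of the two resulting child nodes."""
    best = None
    for feature in rows[0]:
        # sorted() keeps the search order deterministic across runs.
        for value in sorted({row[feature] for row in rows}):
            yes = [d for row, d in zip(rows, durations) if row[feature] == value]
            no = [d for row, d in zip(rows, durations) if row[feature] != value]
            if not yes or not no:
                continue  # a question that does not divide the data is useless
            score = sse(yes) + sse(no)
            if best is None or score < best[2]:
                best = (feature, value, score)
    return best


# Hypothetical feature rows and phone durations in ms (illustrative only).
rows = [
    {"segment_type": "vowel", "stressed": True},
    {"segment_type": "vowel", "stressed": False},
    {"segment_type": "consonant", "stressed": True},
    {"segment_type": "consonant", "stressed": False},
    {"segment_type": "vowel", "stressed": True},
    {"segment_type": "consonant", "stressed": False},
]
durations = [120.0, 80.0, 60.0, 55.0, 115.0, 50.0]

feature, value, score = best_split(rows, durations)
print(feature, value, score)
```

A full tree would apply the same step recursively to each child node until a stopping criterion is met, and each leaf would predict the mean duration of the training phones that reach it.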

FEATURE SET FOR THE SERBIAN LANGUAGE

PHONE DURATION MODELLING

EXPERIMENTAL RESULTS

CONCLUSIONS


Journal: Electronics and Electrical Engineering	Publication Date: Mar 18, 2014
Citations: 3	License type: cc-by


