Modeling Phone Duration of Lithuanian by Classification and Regression Trees, using Very Large Speech Corpus

Giedrius Norkevičius, Gailius Raškinis

doi:10.15388/informatica.2008.213

Abstract

A classification and regression tree (CART) approach was used in this research to model phone duration in Lithuanian. The experiments used 300 thousand vowel samples and 400 thousand consonant samples extracted from the VDU-AB20 corpus. A set of 15 parameters characterizing a phone and its context was selected for duration prediction. The most significant of them were the identifier (ID) of the phone being predicted, the IDs of the adjacent phones, and the number of phones in the syllable. Models were built on two different data sets: one with a single speaker and one with 20 speakers. The influence of cost-complexity pruning and of different pre-pruning values was investigated, and prediction by average leaf duration was compared with prediction by median leaf duration. The most prominent errors were investigated, speech rate normalization and trivial noise reduction were applied, and their influence on the model evaluation parameters is discussed. The achieved results, correlations of 0.8 for vowels and 0.75 for consonants and an RMSE of ~18 ms, are comparable with those reported for Czech, Hindi, Telugu, and Korean.
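The comparison of average and median leaf prediction, and the two evaluation measures named in the abstract (correlation and RMSE), can be illustrated with a minimal Python sketch. It is not the authors' implementation: the tree is assumed to be already grown, each sample is assumed to carry a hypothetical leaf key, and the durations are invented toy values rather than VDU-AB20 data.

```python
from collections import defaultdict
from math import sqrt
from statistics import mean, median


def fit_leaf_predictors(samples, reducer):
    """Group durations by leaf key and reduce each group to one prediction."""
    leaves = defaultdict(list)
    for leaf_key, duration_ms in samples:
        leaves[leaf_key].append(duration_ms)
    return {key: reducer(values) for key, values in leaves.items()}


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    return sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def pearson(actual, predicted):
    """Pearson correlation coefficient between two equal-length sequences."""
    ma, mp = mean(actual), mean(predicted)
    cov = sum((a - ma) * (p - mp) for a, p in zip(actual, predicted))
    va = sum((a - ma) ** 2 for a in actual)
    vp = sum((p - mp) ** 2 for p in predicted)
    return cov / sqrt(va * vp)


# Hypothetical toy data: (leaf key, duration in ms). Leaf "a" contains one
# outlier (140 ms), which pulls the mean but not the median.
train = [("a", 60), ("a", 70), ("a", 140), ("s", 100), ("s", 110), ("s", 120)]
test = [("a", 65), ("a", 75), ("s", 105), ("s", 115)]

by_mean = fit_leaf_predictors(train, mean)
by_median = fit_leaf_predictors(train, median)

actual = [duration for _, duration in test]
pred_mean = [by_mean[key] for key, _ in test]
pred_median = [by_median[key] for key, _ in test]

print("mean leaf:   RMSE =", rmse(actual, pred_mean), "r =", pearson(actual, pred_mean))
print("median leaf: RMSE =", rmse(actual, pred_median), "r =", pearson(actual, pred_median))
```

On this toy data the median is less affected by the single outlier in leaf "a", so its RMSE is lower; whether the same holds on real data depends on how skewed the leaf distributions are.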

Journal: Informatica
Publication Date: Jan 1, 2008
Citations: 19