Abstract

One of the main problems in developing a text-to-speech (TTS) synthesizer for French lies in grapheme-to-phoneme conversion. Automatic converters still produce too many errors in their phoneme sequences to be helpful for people learning French as a foreign language. The prediction of the phonetic realization of word-final consonants (WFCs) in general, and of liaison in particular (les haricots vs. les escargots), is one of the main causes of such conversion errors. Rule-based methods have been used to address these issues, yet the number of rules and their complex interactions make maintenance difficult. To alleviate such problems, we propose an approach that, starting from a database compiled from cases documented in the literature, builds C4.5 decision trees and thereby automates the generation of the required phonetic rules. We investigated the relative efficiency of this method for both context classification and word-final consonant phoneme prediction. A prototype based on this approach reduced Obligatory context classification errors by 52%. Our method has the advantage of sparing us the trouble of coding rules manually, since they are already contained in the training database. Our results suggest that predicting the realization of WFCs, as well as context classification, is still a challenge for the development of a TTS application for teaching French pronunciation.
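To make the decision-tree idea concrete, the following minimal sketch computes the gain ratio, the split criterion C4.5 uses to pick the attribute tested at each node. It is an illustration only, not the prototype described above: the attribute names (`next_onset`, `word1`), the label values, and the five toy examples are assumptions chosen for this sketch and are not taken from the training database.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())

def gain_ratio(rows, labels, attr):
    """C4.5 gain ratio of splitting the examples on attribute `attr`."""
    total = len(rows)
    # Partition the class labels by the value each example has for `attr`.
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[attr], []).append(label)
    # Information gain: entropy before the split minus weighted entropy after.
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    info_gain = entropy(labels) - remainder
    # Split information penalizes attributes with many distinct values.
    split_info = -sum((len(g) / total) * math.log2(len(g) / total)
                      for g in groups.values())
    return info_gain / split_info if split_info > 0 else 0.0

# Hypothetical toy examples (word pair noted in the comment).
rows = [
    {"next_onset": "vowel",     "word1": "les"},  # les escargots
    {"next_onset": "h_aspire",  "word1": "les"},  # les haricots
    {"next_onset": "consonant", "word1": "les"},  # les chats
    {"next_onset": "vowel",     "word1": "des"},  # des amis
    {"next_onset": "consonant", "word1": "des"},  # des livres
]
labels = ["liaison", "none", "none", "liaison", "none"]

# The attribute with the highest gain ratio would become the root test.
best = max(["next_onset", "word1"], key=lambda a: gain_ratio(rows, labels, a))
print(best)
```

On this toy data the onset of the following word separates the classes far better than the identity of the first word, so it would be selected as the root test. A full C4.5 learner would apply the same criterion recursively and then prune the resulting tree.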
