Abstract

Text-to-speech conversion can be done with two approaches: a dictionary-based (database) approach and grapheme-to-phoneme (G2P) mapping. One drawback of the dictionary-based approach is that its performance depends on the size of the dictionary or database. For domain-specific conversion, a simple rule-based technique is used to play pre-recorded audio for each equivalent token. This is easy to design, but it is limited by the mapping to the sound database and by the availability of the audio files in that database. Grapheme-to-phoneme conversion, by contrast, can be used in any domain. Its advantages are the limited size of the required database, ease of mapping, and adaptability to the domain. However, G2P suffers from pronunciation ambiguity in the formation of the audio output. This paper discusses grapheme-to-phoneme mapping and its application in a text-to-speech conversion system. In this work, Assamese (a scheduled Indian language encoded in Unicode) is used as the experimental language, and its performance is analyzed against another Unicode language (Hindi). English (ASCII) is used as a benchmark for comparison with the target language.
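The difference between the two approaches can be illustrated with a minimal Python sketch. It is not the system described in this paper: the dictionary entries, the rule table, the phoneme symbols, and the function names `dictionary_lookup` and `g2p_convert` are all hypothetical, and the rule table is a naive one-grapheme-to-one-phoneme mapping chosen only to show the idea.

```python
# Toy data only: these entries and phoneme symbols are hypothetical
# and are not taken from the paper.
PRONUNCIATION_DICT = {
    "cat": ["K", "AE", "T"],
}

# Naive one-grapheme-to-one-phoneme rules (an assumption for illustration).
G2P_RULES = {"c": "K", "a": "AE", "t": "T", "s": "S"}


def dictionary_lookup(word):
    """Dictionary-based approach: return the stored phonemes, or None if absent."""
    return PRONUNCIATION_DICT.get(word.lower())


def g2p_convert(word):
    """G2P approach: map each grapheme through the rule table, skipping unknowns."""
    phonemes = []
    for grapheme in word.lower():
        if grapheme in G2P_RULES:
            phonemes.append(G2P_RULES[grapheme])
    return phonemes


if __name__ == "__main__":
    for word in ("cat", "cats"):
        print(word, dictionary_lookup(word), g2p_convert(word))
```

In this sketch the dictionary has no entry for "cats", so the lookup fails, while the rule table still produces a phoneme sequence. Conversely, a fixed rule such as "c" to "K" cannot distinguish contexts in which the same grapheme is pronounced differently, which is the kind of ambiguity noted above.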
