Abstract
Most areas of language and speech technology require, directly or indirectly, the handling of unrestricted text, and text-to-speech systems in particular must work on real text. To build a natural-sounding speech synthesis system, it is essential that the text processing component produce an appropriate sequence of phonemic units corresponding to an arbitrary input text. We use a novel approach in which the input text is tokenized and each token is classified by type. Token sense disambiguation relies on the semantic nature of the language, after which expansion rules are applied to obtain the normalized text. Little work has been done so far on text normalization for Telugu. In this paper we discuss our efforts to design a rule-based system for text normalization in the context of building a Telugu text-to-speech system.
Highlights
The objective of the text processing component [1, 2] is to process the given input text and convert its written (orthographic) form into the spoken form. The spoken form is then realized by the speech generation component, either by synthesis from parameters or by selection of units from a large speech corpus.
For natural-sounding speech synthesis [3, 4], it is essential that the text processing component produce an appropriate sequence of phonemic units corresponding to an arbitrary input text.
This paper motivates the need to preprocess text before it is handed to any synthesizer.
Summary
The objective of the text processing component [1, 2] is to process the given input text and convert its written (orthographic) form into the spoken form. The spoken form is then realized by the speech generation component, either by synthesis from parameters or by selection of units from a large speech corpus. For natural-sounding speech synthesis [3, 4], it is essential that the text processing component produce an appropriate sequence of phonemic units corresponding to an arbitrary input text. The standard word representation is obtained using expansion rules and a lookup table (database).
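The tokenize, classify and expand steps described above can be illustrated with a minimal sketch. It is not the paper's implementation: the token types, the lookup table entries, the digit-by-digit number rule and all function names are assumptions made for illustration, English stand-ins are used in place of Telugu rules (which this page does not give), and the token sense disambiguation step is omitted.

```python
import re

# Hypothetical lookup table ("database"); the entries are illustrative only.
ABBREVIATIONS = {"dr.": "doctor", "km": "kilometres"}

# Words for single digits, used by the assumed digit-by-digit expansion rule.
DIGIT_WORDS = ["zero", "one", "two", "three", "four",
               "five", "six", "seven", "eight", "nine"]


def classify(token):
    """Assign an assumed token type: NUMBER, ABBREVIATION or WORD."""
    if re.fullmatch(r"[0-9]+", token):
        return "NUMBER"
    if token.lower() in ABBREVIATIONS:
        return "ABBREVIATION"
    return "WORD"


def expand(token, token_type):
    """Apply the expansion rule for the given token type."""
    if token_type == "NUMBER":
        # Assumed rule: read the number digit by digit.
        return " ".join(DIGIT_WORDS[int(d)] for d in token)
    if token_type == "ABBREVIATION":
        # Resolve through the lookup table.
        return ABBREVIATIONS[token.lower()]
    # Ordinary words are already in standard form.
    return token


def normalize(text):
    """Tokenize on whitespace, classify each token, then expand it."""
    return " ".join(expand(tok, classify(tok)) for tok in text.split())


if __name__ == "__main__":
    print(normalize("Dr. Rao walked 12 km"))
```

Keeping classification separate from expansion means a new token type (for example dates or currency) would only need one extra branch in each function plus any table entries it relies on.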
More From: INTERNATIONAL JOURNAL OF COMPUTERS & TECHNOLOGY