Text Normalization for Telugu Text-to-Speech Synthesis

Dr K.V.N Sunitha, P. Sunitha Devi

doi:10.24297/ijct.v11i2.1176

Abstract

Most areas of language and speech technology require, directly or indirectly, the handling of unrestricted text, and text-to-speech systems in particular must work on real text. To build a natural-sounding speech synthesis system, it is essential that the text processing component produce an appropriate sequence of phonemic units corresponding to an arbitrary input text. A novel approach is used in which the input text is tokenized and each token is classified by type. Token sense disambiguation relies on the semantic nature of the language, after which expansion rules are applied to obtain the normalized text. For Telugu, however, little work has been done on text normalization. In this paper we discuss our efforts to design a rule-based system for text normalization in the context of building a Telugu text-to-speech system.

Highlights

The objective of the text processing component [1, 2] is to process the given input text and convert the written form of the text into the spoken form. This orthographic form is realized by the speech generation component, either by synthesis from parameters or by selection of a unit from a large speech corpus.
For natural-sounding speech synthesis [3, 4], it is essential that the text processing component produce an appropriate sequence of orthographic units corresponding to an arbitrary input text.
This paper presents the need for text to be preprocessed before it is handed to any synthesizer.

Summary

INTRODUCTION

The objective of the text processing component [1, 2] is to process the given input text and convert the written form (orthographic form) of the text into the spoken form. This orthographic form is realized by the speech generation component, either by synthesis from parameters or by selection of a unit from a large speech corpus. For natural-sounding speech synthesis [3, 4], it is essential that the text processing component produce an appropriate sequence of orthographic units corresponding to an arbitrary input text. The standard word representation is obtained using the expansion rules and a lookup table (database).
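The tokenize, classify, and expand stages described above can be illustrated with a minimal sketch. It is not the authors' implementation: the token labels, the regular expressions, the whitespace tokenizer, and the function names are all assumptions made for illustration, and the digit-to-word table is a hypothetical stand-in for the lookup table (database) the paper mentions. Only phone-like tokens are expanded here, digit by digit; proper cardinal, ordinal, date, and currency expansion would need language-specific rules that this sketch does not attempt.

```python
import re

# Hypothetical lookup table mapping ASCII digits to Telugu words.
# Illustrative only; not taken from the paper's database.
DIGIT_WORDS = {
    "0": "సున్నా", "1": "ఒకటి", "2": "రెండు", "3": "మూడు", "4": "నాలుగు",
    "5": "ఐదు", "6": "ఆరు", "7": "ఏడు", "8": "ఎనిమిది", "9": "తొమ్మిది",
}

# Ordered (label, pattern) pairs; the first pattern that matches the
# whole token wins. Labels and patterns are assumptions for this sketch.
TOKEN_PATTERNS = [
    ("PHONE", re.compile(r"[0-9]{10}")),
    ("PERCENT", re.compile(r"[0-9]+(\.[0-9]+)?%")),
    ("DECIMAL", re.compile(r"[0-9]+\.[0-9]+")),
    ("CARDINAL", re.compile(r"[0-9]+")),
]


def tokenize(text):
    """Split the input on whitespace (a deliberately naive tokenizer)."""
    return text.split()


def classify(token):
    """Return the label of the first pattern matching the whole token."""
    for label, pattern in TOKEN_PATTERNS:
        if pattern.fullmatch(token):
            return label
    return "WORD"


def spell_digits(digits):
    """Expand a digit string one digit at a time using the lookup table."""
    return " ".join(DIGIT_WORDS[d] for d in digits)


def normalize(text):
    """Expand phone-like tokens; pass every other token through unchanged."""
    out = []
    for token in tokenize(text):
        if classify(token) == "PHONE":
            out.append(spell_digits(token))
        else:
            out.append(token)
    return " ".join(out)
```

Because the tokenizer splits only on whitespace, a token with trailing punctuation such as `2013.` falls through to `WORD`; a fuller system would separate punctuation first and would use a disambiguation step to decide, for example, whether a ten-digit token is really a phone number or a large cardinal.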

Nature and Format of Telugu Text

PROPOSED MODEL FOR TEXT NORMALIZATION

Tokenization and Token Classification

Token Sense Disambiguation

Standard Word Generation

IMPLEMENTATION OF THE SYSTEM

Tokenization

Token Classification

Cardinal Numbers

Ordinal Numbers

Decimal Numbers

Phone Numbers

Date Formats

Currency

Abbreviations and Acronyms

Address

Percentages

Coverage Analysis

CONCLUSIONS

Journal: INTERNATIONAL JOURNAL OF COMPUTERS & TECHNOLOGY
Publication Date: Oct 10, 2013
Citations: 3
License type: CC BY 4.0
