Abstract

In this paper we demonstrate the use of prosody models for developing speech systems in Indian languages. Duration and intonation models developed using feedforward neural networks are considered as prosody models. Labelled broadcast news data in the languages Hindi, Telugu, Tamil and Kannada is used for developing the neural network models for predicting the duration and intonation. The features representing the positional, contextual and phonological constraints are used for developing the prosody models. In this paper, the use of prosody models is illustrated using speech recognition, speech synthesis, speaker recognition and language identification applications. Autoassociative neural networks and support vector machines are used as classification models for developing the speech systems. The performance of the speech systems has shown to be improved by combining the prosodic features along with one popular spectral feature set consisting of Weighted Linear Prediction Cepstral Coefficients (WLPCCs).

Full Text
Paper version not known

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.