Generation of suprasegmental information for speech using a recurrent neural network and binary gravitational search algorithm for feature selection

Mansour Sheikhan

doi:10.1007/s10489-013-0505-x

Abstract

Suprasegmental (prosody) features of discourse provide a vehicle by which speakers reflect their mental purposes to listeners. Generating suitable prosody information is critical to expressing messages and improving the intelligibility and naturalness of synthetic speech. Generic prosody generators should provide information about pitch frequency (F0) contours, energy levels, word durations, and inter-word pause durations for speech synthesizers. The present study used a recurrent neural network (RNN) for prosody generation. The inputs of this RNN were word-level and syllable-level linguistic features. To provide data efficiently for the RNN-based prosody generator in the training, validation, and test phases, automatic segmentation and labeling of phonemes were performed. The number of inputs to the RNN was reduced by employing a binary gravitational search algorithm (BGSA) for feature selection (FS). The proposed prosody generator provided 12 output prosodic parameters for the current syllable for representing pitch contour, log-energy contour, inter-syllable pause duration, duration of syllable, duration of the vowel in the syllable, and vowel onset time. Experimental results demonstrated the success of the RNN-based prosody generator in synthesizing the six prosodic elements with acceptable root mean square error (RMSE). By using a BGSA-based FS unit, a lighter neural model was achieved with a 53% reduction in the number of weight connections, producing RMSEs with acceptable degradation over the no-FS unit prosody generator. The performance of the BGSA-based FS method was compared with a binary particle swarm optimization (BPSO) algorithm, and the BGSA showed slightly better results. A modified mean opinion score scale was used to evaluate the intelligibility and naturalness of synthesized speech using the proposed method.
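To make the feature-selection step more concrete, the following is a minimal sketch of a binary gravitational search over feature masks. It follows one commonly described BGSA formulation (Hamming distance between agents, a tanh transfer function for bit flips, linearly decaying gravity) and is not taken from the paper itself. The function name `bgsa_select`, all parameter values, and the `toy_fitness` function are hypothetical; in the study the fitness would presumably depend on the RNN's prosody RMSE, which is replaced here by a cheap stand-in.

```python
import numpy as np

def bgsa_select(fitness, n_features, n_agents=20, n_iter=50,
                g0=1.0, v_max=6.0, seed=0):
    """Sketch of binary GSA; maximizes `fitness(mask)` over 0/1 masks."""
    rng = np.random.default_rng(seed)
    x = rng.integers(0, 2, size=(n_agents, n_features))  # binary positions
    v = np.zeros((n_agents, n_features))                 # velocities
    best_mask, best_fit = None, -np.inf

    for t in range(n_iter):
        fit = np.array([fitness(row) for row in x], dtype=float)
        i_best = int(np.argmax(fit))
        if fit[i_best] > best_fit:                       # keep best-so-far mask
            best_fit, best_mask = fit[i_best], x[i_best].copy()

        # Masses: normalized fitness, so better agents are heavier.
        spread = fit.max() - fit.min()
        m = (fit - fit.min()) / spread if spread > 0 else np.ones(n_agents)
        mass = m / m.sum()

        g = g0 * (1.0 - t / n_iter)                       # decaying gravity
        k = max(1, round(n_agents * (1.0 - t / n_iter)))  # shrinking elite set
        elite = np.argsort(fit)[::-1][:k]

        acc = np.zeros_like(v)
        for i in range(n_agents):
            for j in elite:
                if j == i:
                    continue
                r = np.sum(x[i] != x[j])                  # Hamming distance
                pull = g * mass[j] * (x[j] - x[i]) / (r + 1e-9)
                acc[i] += rng.random(n_features) * pull

        # Stochastic velocity update, clipped to [-v_max, v_max].
        v = np.clip(rng.random(v.shape) * v + acc, -v_max, v_max)
        # Flip each bit with probability |tanh(v)|.
        flip = rng.random(v.shape) < np.abs(np.tanh(v))
        x = np.where(flip, 1 - x, x)

    return best_mask, best_fit

# Hypothetical stand-in fitness: reward 3 "relevant" features, penalize the rest.
relevant = np.array([1, 1, 1, 0, 0, 0, 0, 0, 0, 0])

def toy_fitness(mask):
    return float(np.sum(mask * relevant) - 0.5 * np.sum(mask * (1 - relevant)))

mask, score = bgsa_select(toy_fitness, n_features=10)
print(mask, score)
```

The returned mask marks which input features would be kept as RNN inputs. Swapping `toy_fitness` for a function that trains or evaluates a model on the selected features would bring the sketch closer to the setup described above, at a much higher cost per evaluation.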

Journal: Applied Intelligence
Publication Date: Jan 12, 2014
Citations: 11

Similar Papers

Introducing clustering based population in Binary Gravitational Search Algorithm for Feature Selection
Ritam Guha ... Seyedali Mirjalili
Applied Soft Computing | VOL. 93
08 May 2020

A novel extended binary cuckoo search algorithm for feature selection
Sadegh Salesi ... Georgina Cosma
01 Oct 2017

Optimizing Cuckoo Feature Selection Algorithm with the New Initialization Strategy and Fitness Function
Yingying Wang ... Zhanshan Li
01 Jan 2018

Improving the Binary Fish School Search Algorithm for feature selection
Raphael F Carneiro ... Carmelo J A Bastos-Filho
01 Nov 2016
