Text to speech using Mel-Spectrogram with deep learning algorithms

Abdulamir A Karim, Suha Mohammed Saleh

doi:10.21533/pen.v10i3.3113

Abstract

The purpose of text to speech (TTS), sometimes called speech synthesis, is to synthesize natural, intelligible speech from a given text. TTS technologies are used in a wide range of applications in media, chatbots, and entertainment, among other fields, which makes TTS an active topic in the research community. Recent progress in artificial intelligence, especially in deep learning and neural networks, has enabled TTS systems to produce high-quality synthesized speech. Despite this success, currently available systems require very long training and inference times, which leaves the field dominated by big tech companies. This paper proposes a model based on convolutional neural networks (CNN) and gated recurrent units (GRU). The proposed model can run even in low-resource computational environments and requires little training time. It achieves a mean opinion score (MOS) of 4.26, higher than the MOS achieved by state-of-the-art methods.
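For readers unfamiliar with the metric, a MOS is conventionally the arithmetic mean of listener ratings on a 1 (bad) to 5 (excellent) scale. The sketch below shows that calculation with a normal-approximation 95% confidence interval. The function name and the ratings are hypothetical, chosen only for illustration; they are not the paper's evaluation code or data.

```python
from math import sqrt
from statistics import mean, stdev


def mean_opinion_score(ratings):
    """Return (MOS, 95% CI half-width) for listener ratings on a 1-5 scale.

    Hypothetical helper for illustration; not taken from the paper.
    """
    if not ratings:
        raise ValueError("at least one rating is required")
    if any(r < 1 or r > 5 for r in ratings):
        raise ValueError("ratings must lie in [1, 5]")
    mos = mean(ratings)
    # Normal-approximation interval; defined as 0.0 for a single rating.
    if len(ratings) > 1:
        half_width = 1.96 * stdev(ratings) / sqrt(len(ratings))
    else:
        half_width = 0.0
    return mos, half_width


# Made-up ratings from five hypothetical listeners (not data from the paper).
example_ratings = [5, 4, 4, 5, 3]
mos, ci = mean_opinion_score(example_ratings)
print(f"MOS = {mos:.2f} +/- {ci:.2f}")
```

With only a handful of ratings the interval is wide, which is why MOS comparisons between systems are usually reported together with the number of listeners and utterances.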

Journal: Periodicals of Engineering and Natural Sciences (PEN)
Publication Date: Jun 29, 2022
Citations: 2
License type: CC BY
