Bangla text normalization for text-to-speech synthesizer using machine learning algorithms

Md Rezaul Islam, Arif Ahmad, Mohammad Shahidur Rahman

doi:10.1016/j.jksuci.2023.101807

Abstract

Text normalization (TN) for a text-to-speech (TTS) synthesizer is the transformation of non-standard words, such as times, ordinal numbers, equations, ranges, and dates, into standard words that match their pronunciation. Text normalization is an essential part of every TTS synthesizer: without it, the generated speech will be unintelligible. Because previous approaches have performed unsatisfactorily, this paper proposes a text normalization method for the Bangla language. First, we produce a tokenized dataset from a Bangla corpus, labeling each token with a semiotic class using regular expressions. Next, an XGBClassifier model is trained on these labeled tokens. The trained model then identifies the semiotic class of each token in a new Bangla text corpus. Finally, the method produces the normalized text for each token by calling the class function that corresponds to the predicted class. This method will help Bangla TTS synthesizers produce more intelligible speech. Its token classification accuracy is 99.997%.
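
The first and last steps of the pipeline described in the abstract (labeling tokens with a semiotic class using regular expressions, then calling a class function chosen by the predicted class) can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration: the class names `RANGE`, `DIGIT`, and `PLAIN`, the patterns, and the helper functions are hypothetical, and the abstract does not give the paper's actual class inventory or expressions. The regex labeler also stands in for the trained XGBClassifier model, and digits are read one by one rather than as full cardinal numbers.

```python
import re

# Hypothetical setup: Bangla digits and their spoken forms, read digit by digit.
BN_DIGITS = "০১২৩৪৫৬৭৮৯"
DIGIT_WORDS = ["শূন্য", "এক", "দুই", "তিন", "চার", "পাঁচ", "ছয়", "সাত", "আট", "নয়"]

# Hypothetical semiotic classes; the first pattern that fully matches wins.
CLASS_PATTERNS = [
    ("RANGE", re.compile(r"[০-৯]+-[০-৯]+")),
    ("DIGIT", re.compile(r"[০-৯]+")),
]


def label_token(token):
    """Assign a semiotic class to a token using regular expressions."""
    for name, pattern in CLASS_PATTERNS:
        if pattern.fullmatch(token):
            return name
    return "PLAIN"


def read_digits(token):
    """Verbalize a digit string one digit at a time (a simplification)."""
    return " ".join(DIGIT_WORDS[BN_DIGITS.index(ch)] for ch in token)


def read_range(token):
    """Verbalize 'a-b' as 'a থেকে b', reading each side digit by digit."""
    start, end = token.split("-")
    return f"{read_digits(start)} থেকে {read_digits(end)}"


# One class function per semiotic class; plain tokens pass through unchanged.
CLASS_FUNCTIONS = {
    "RANGE": read_range,
    "DIGIT": read_digits,
    "PLAIN": lambda token: token,
}


def normalize(text, classify=label_token):
    """Normalize whitespace-separated tokens by dispatching on their class.

    `classify` is where a trained model's prediction would plug in;
    the regex labeler is used here only as a stand-in.
    """
    return " ".join(CLASS_FUNCTIONS[classify(tok)](tok) for tok in text.split())
```

With these assumptions, `normalize("৫-৭")` returns `"পাঁচ থেকে সাত"`. Passing a different `classify` callable is how a learned classifier could replace the regex rules at prediction time without touching the class functions.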

Journal: Journal of King Saud University - Computer and Information Sciences
Publication Date: Oct 20, 2023
Citations: 2
License type: cc-by-nc-nd

