Modified Grapheme Encoding and Phonemic Rule to Improve PNNR-Based Indonesian G2P

Suyanto, Agus Harjoko, Sri Hartati

doi:10.14569/ijacsa.2016.070358

Abstract

Grapheme-to-phoneme conversion (G2P) is very important in both speech recognition and speech synthesis. The existing Indonesian G2P based on the pseudo nearest neighbour rule (PNNR) has two drawbacks: the grapheme encoding does not accommodate all Indonesian phonemic rules, and the PNNR must select the best phoneme from all possible conversions even though some of them could be filtered out by phonemic rules. In this paper, a modified partial orthogonal binary grapheme encoding and a phonemic-based rule are proposed to improve the performance of PNNR-based Indonesian G2P. An evaluation using 5-fold cross-validation, in which each fold uses 40K words to develop the model and 10K words for evaluation, shows that the two proposed concepts reduce the phoneme error rate (PER) by a relative 13.07%. A more detailed analysis shows that most errors come from the grapheme ⟨e⟩, which can be converted into either /E/ or /ə/. These errors arise because four prefixes, 'ber', 'me', 'per', and 'ter', produce many conversions that are ambiguous with basic words, and because some similar compound words have different pronunciations of the grapheme ⟨e⟩. A stemming procedure can be applied to reduce those errors.
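The evaluation setup described in the abstract can be illustrated with a minimal sketch. It assumes that PER is the percentage of wrongly converted phonemes among all phonemes and that the relative reduction is measured against the baseline PER; the function names and the numbers in the usage note are hypothetical and are not taken from the paper.

```python
def phoneme_error_rate(num_errors: int, num_phonemes: int) -> float:
    """PER in percent: wrongly converted phonemes over all phonemes (assumed definition)."""
    return 100.0 * num_errors / num_phonemes


def relative_reduction(baseline_per: float, improved_per: float) -> float:
    """Relative PER reduction in percent, measured against the baseline PER."""
    return 100.0 * (baseline_per - improved_per) / baseline_per


def k_fold_splits(words, k=5):
    """Yield (train, test) pairs; each fold holds out one contiguous 1/k slice.

    Any remainder when len(words) is not divisible by k stays in the training part.
    """
    fold_size = len(words) // k
    for i in range(k):
        test = words[i * fold_size:(i + 1) * fold_size]
        train = words[:i * fold_size] + words[(i + 1) * fold_size:]
        yield train, test
```

With 50K words and `k=5`, each fold would use 40K words for training and 10K words for testing, which matches the sizes quoted in the abstract. As a purely illustrative calculation, `relative_reduction(2.0, 1.5)` gives a relative reduction of 25%, even though the absolute PER drops by only 0.5 percentage points.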

Highlights

A phonemization or letter-to-sound conversion, more commonly known as grapheme-to-phoneme conversion (G2P), is an important module in both speech recognition and speech synthesis.
The phonemic rule filters the potential conversions from which the pseudo nearest neighbour rule (PNNR) selects; for instance, the first grapheme ⟨a⟩, followed by ⟨b⟩, in the grapheme sequence ⟨abai⟩ can be converted into either /A/ or /A+P/.
This paper discusses how to use PNNR to develop the Indonesian G2P, the proposed modified partial orthogonal binary grapheme encoding and the phonemic rule-based phoneme filtering, the experimental results showing the performance of both proposed concepts, and the conclusion.

Summary

INTRODUCTION

A phonemization or letter-to-sound conversion, more commonly known as grapheme-to-phoneme conversion (G2P), is an important module in both speech recognition and speech synthesis. A G2P is developed using machine learning-based methods, such as instance-based learning [1], table lookup with defaults [1], self-learning techniques [2], hidden Markov model [3], morphology and phoneme history [4], joint multigram models [5], conditional random fields [6], and Kullback-Leibler divergence-based hidden Markov model [7]. These methods are commonly very complex and designed to be language independent, but they give varying performance for some phonemically complex languages, such as English, Dutch, French, and German. For example, a grapheme cannot be pronounced as /aU/ if it is followed by neither ⟨u⟩ nor ⟨w⟩. Such phonemic rules can be used to filter the possible conversions so that PNNR can convert a grapheme into the correct phoneme more accurately and faster. The phonemic rule filters the potential conversions from which PNNR selects; for instance, the first grapheme ⟨a⟩, followed by ⟨b⟩, in the grapheme sequence ⟨abai⟩ can be converted into either /A/ or /A+P/. The PNNR then decides the best conversion of each given grapheme among the remaining possible phonemes.
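The two-stage idea described above, filtering candidates with a phonemic rule and then letting PNNR choose among the survivors, can be sketched as follows. This is a minimal illustration under several assumptions: the /aU/ rule is applied to candidates of the grapheme ⟨a⟩, the candidate labels and toy feature vectors are hypothetical rather than the paper's grapheme encoding, and the classifier is the generic distance-based pseudo nearest neighbour rule (the i-th nearest neighbour in a class weighted by 1/i), which may differ from the variant used in the paper.

```python
import math
from collections import defaultdict


def filter_candidates(candidates, next_grapheme):
    """Drop /aU/ unless the next grapheme is 'u' or 'w' (the rule quoted above)."""
    if next_grapheme in ("u", "w"):
        return list(candidates)
    return [p for p in candidates if p != "aU"]


def pnnr_classify(query, examples, k=3, allowed=None):
    """Generic PNNR: pick the class with the smallest weighted distance sum.

    examples: list of (vector, label) pairs; allowed: optional set of labels
    that survived the phonemic filter. Each class is assumed to have at least
    k examples; otherwise classes with fewer examples get shorter sums.
    """
    distances = defaultdict(list)
    for vector, label in examples:
        if allowed is None or label in allowed:
            distances[label].append(math.dist(query, vector))
    best_label, best_score = None, math.inf
    for label, dists in distances.items():
        nearest = sorted(dists)[:k]
        # The i-th nearest neighbour of this class contributes distance / i.
        score = sum(d / i for i, d in enumerate(nearest, start=1))
        if score < best_score:
            best_label, best_score = label, score
    return best_label


# Toy data: two hypothetical phoneme classes in a 2-D feature space.
examples = [((0.0, 0.0), "a"), ((0.0, 1.0), "a"),
            ((1.0, 1.0), "aU"), ((1.0, 0.9), "aU")]
query = (0.9, 1.0)
unfiltered = pnnr_classify(query, examples, k=2)
allowed = set(filter_candidates(["a", "aU"], next_grapheme="b"))
filtered = pnnr_classify(query, examples, k=2, allowed=allowed)
```

In this toy case the unfiltered classifier picks `"aU"` because the query lies closest to that class, while the filtered call can only return `"a"`, since the next grapheme is ⟨b⟩. Filtering also shrinks the set of examples whose distances must be computed, which is one plausible reading of why the rule makes conversion faster as well as more accurate.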

Data Preprocessing

Modified Grapheme Encoding

Phonemic Rule-based Phoneme Filtering

Pseudo Nearest Neighbour Rule

Optimum Parameters

EXPERIMENTAL RESULTS

Modified Grapheme Encoding and Phonemic Rule

CONCLUSION

Most Errors

Journal: International Journal of Advanced Computer Science and Applications	Publication Date: Jan 1, 2016
Citations: 14	License type: cc-by
