Nearest Neighbor-Based Indonesian G2P Conversion

Suyanto Suyanto, Agus Harjoko

doi:10.12928/telkomnika.v12i2.57

Open Access

Journal: TELKOMNIKA (Telecommunication Computing Electronics and Control)
Publication Date: Jun 1, 2014
Citations: 19
License type: CC BY-SA

Affiliation: Telkom University, ADA University, Universitas Gadjah Mada

Abstract

Grapheme-to-phoneme conversion (G2P), also known as letter-to-sound conversion, is an important module in both speech synthesis and speech recognition. G2P methods give varying accuracies for different languages even though they are designed to be language independent. This paper discusses a new model for Indonesian G2P based on the pseudo nearest neighbor rule (PNNR). The model introduces a partial orthogonal binary code for graphemes, contextual weighting, and neighborhood weighting. Tests on 9,604 unseen words show that the model parameters are easy to tune to reach high accuracy. Tests on 123 sentences containing homographs show that the model can disambiguate homographs if it uses a long graphemic context. Compared with an information gain tree, PNNR gives a slightly higher phoneme error rate, but it can disambiguate homographs.
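
The abstract names the ingredients of the model without showing how they fit together, so the following is a minimal Python sketch of a pseudo-nearest-neighbor classifier over grapheme windows. It is not the authors' implementation. Several parts are assumptions made for illustration: graphemes are compared by plain symbol mismatch rather than the paper's partial orthogonal binary code, an exponential `decay` stands in for contextual weighting, the 1/rank weights follow the commonly described form of the pseudo nearest neighbor rule and stand in for neighborhood weighting, and the function names, the `@` symbol for schwa, and the toy training pairs are all hypothetical.

```python
from collections import defaultdict


def context_windows(word, half_width=2, pad="*"):
    """Return one window per letter: the focus letter plus
    half_width letters of context on each side (padded at word edges)."""
    padded = pad * half_width + word + pad * half_width
    return [padded[i:i + 2 * half_width + 1] for i in range(len(word))]


def window_distance(a, b, decay=0.5):
    """Weighted mismatch count between two equal-length windows.
    A mismatch at the focus costs 1; the cost is multiplied by `decay`
    for each step away from the focus (assumed contextual weighting)."""
    focus = len(a) // 2
    return sum(decay ** abs(i - focus)
               for i, (x, y) in enumerate(zip(a, b)) if x != y)


def max_distance(length, decay=0.5):
    """Largest possible distance for a window of the given length."""
    focus = length // 2
    return sum(decay ** abs(i - focus) for i in range(length))


def pnnr_classify(query, training, k=2, decay=0.5):
    """Pseudo nearest neighbor rule: for each phoneme class, take the k
    nearest training windows, weight the i-th nearest distance by 1/i,
    and return the class with the smallest weighted sum."""
    by_class = defaultdict(list)
    for window, phoneme in training:
        by_class[phoneme].append(window_distance(query, window, decay))
    scores = {}
    for phoneme, dists in by_class.items():
        nearest = sorted(dists)[:k]
        # Classes with fewer than k examples are padded with the worst
        # case so that small classes are not favored.
        nearest += [max_distance(len(query), decay)] * (k - len(nearest))
        scores[phoneme] = sum(d / rank
                              for rank, d in enumerate(nearest, start=1))
    return min(scores, key=scores.get)


# Toy training data (hypothetical, not from the paper): windows centered
# on the letter "e", labeled "e" or "@" (schwa).
training = [
    (context_windows("meja")[1], "e"),
    (context_windows("merah")[1], "e"),
    (context_windows("besar")[1], "@"),
    (context_windows("kecil")[1], "@"),
]

query = context_windows("mewah")[1]  # window around the "e" in "mewah"
print(query, "->", pnnr_classify(query, training))
```

In this sketch a wider `half_width` corresponds to the longer graphemic context that the abstract associates with homograph disambiguation, while `k` and `decay` play the role of tunable model parameters.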
