Abstract

Grapheme-to-phoneme (G2P) conversion maps an input letter sequence to its corresponding pronunciation. Several rule-based G2P models are available for Korean, but how well they handle individual Korean phonological rules has not been well investigated. We establish the phonological rules that a G2P model for Korean should reflect, and test the two most popular Korean G2P models, g2pk and KoG2P, on each of these rules. To do so, we created a gold-standard corpus based on manual phonetic transcription, measured the performance of the two models against it, and identified the phonological rules on which the models show relative success or failure in deriving the correct output. We then implemented additional phonological rules, such as h-deletion, ui-variation, and consonant place assimilation, and restructured the ordering of rules such as h/th-neutralization in the model. We show that our revised model substantially improves performance. Further, we argue that the major limitations of current rule-based approaches to G2P are their binary treatment of phonological rules and their lack of information about prosodic boundaries. We propose that a rule-based G2P system should reflect the stochastic nature of phonological processes, informed by existing research on the gradient application of phonological rules as a function of factors such as lexical frequency and speech register.
