Abstract

Transliteration plays a very significant role in machine translation, which has many applications such as cross-lingual information retrieval, communication, question-answering etc. The main objective of this research paper is to provide a method for transliteration of named entities from English to Hindi language. The proposed method consists of two modules, both of which apply phoneme-based approach to transliterate named entities. For transliteration, Module-I utilizes CMU Pronouncing dictionary, which is a collection of 133270 words along with their pronunciation. If the word to be transliterated is not found in CMU Pronouncing dictionary, Module-II is used. Module-II is based on 5-gram model, in which a maximum of five letters (two left, two right and one target letter) are used to generate transliterated target letter. The system has been tested on a database of 2408 North-Indian names. Google Input tool for Windows has been used for comparative study of the proposed transliteration system. The word accuracy of the transliteration system has been found to be 70.22% against 58.73% of Google Input tool.

Full Text
Paper version not known

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.