Abstract

When newspapers and magazines translate proper nouns from foreign languages into Chinese, the Chinese translations they choose (termed transliterations) are typically phonetically similar to the original word. Because many translators work without a common standard, the same proper noun may have many Chinese transliterations: some use the same sounds but different Chinese characters, and others differ in both sounds and characters. This confuses readers and, more importantly, leads to incomplete Chinese web search results. This paper investigates similarity comparison of transliterations as a first step toward solving the incomplete search problem. Our research framework has two stages: training and recognition. In the training stage, we compare Chinese speech sounds and construct a database of similarity matrices. In the recognition stage, we first convert transliterations to phonetic notation and then apply the matrices from the database to calculate the degree of similarity among different transliterations. Highly similar transliterations are likely to be synonymous, that is, to refer to the same foreign proper noun. Our results indicate that this methodology outperforms traditional pinyin approaches.
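The recognition stage described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the phonetic notation, the granularity of the similarity matrices, or the scoring formula. The sketch assumes pinyin syllables as the phonetic notation, a tiny hand-written pronunciation table (`PINYIN`), made-up similarity values (`SYLLABLE_SIM`) standing in for a trained matrix, and a weighted edit distance as the comparison method. All names here are hypothetical.

```python
# Hypothetical toy pronunciation table; a real system would need a full lexicon.
PINYIN = {
    "布": "bu", "什": "shi", "希": "xi",
    "奥": "ao", "欧": "ou", "巴": "ba", "马": "ma",
}

# Made-up syllable similarities in [0, 1], standing in for a matrix that
# the training stage would produce. Pairs not listed are treated as 0.
SYLLABLE_SIM = {
    ("shi", "xi"): 0.8,
    ("ao", "ou"): 0.7,
}


def syllable_similarity(a, b):
    """Look up the similarity of two syllables, treating the table as symmetric."""
    if a == b:
        return 1.0
    return SYLLABLE_SIM.get((a, b), SYLLABLE_SIM.get((b, a), 0.0))


def to_syllables(word):
    """Convert a transliteration to its syllable sequence (one per character)."""
    return [PINYIN[ch] for ch in word]


def transliteration_similarity(w1, w2):
    """Score two transliterations in [0, 1] using a weighted edit distance.

    Substituting one syllable for another costs 1 minus their similarity;
    inserting or deleting a syllable costs 1.
    """
    s1, s2 = to_syllables(w1), to_syllables(w2)
    n, m = len(s1), len(s2)
    # dist[i][j] is the cheapest alignment of s1[:i] with s2[:j].
    dist = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dist[i][0] = float(i)
    for j in range(1, m + 1):
        dist[0][j] = float(j)
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            substitution = 1.0 - syllable_similarity(s1[i - 1], s2[j - 1])
            dist[i][j] = min(
                dist[i - 1][j] + 1.0,               # delete a syllable from s1
                dist[i][j - 1] + 1.0,               # insert a syllable from s2
                dist[i - 1][j - 1] + substitution,  # match or substitute
            )
    longest = max(n, m)
    return 1.0 if longest == 0 else 1.0 - dist[n][m] / longest


if __name__ == "__main__":
    # 布什 and 布希 are both used for "Bush"; 奥巴马 and 欧巴马 for "Obama".
    print(transliteration_similarity("布什", "布希"))
    print(transliteration_similarity("奥巴马", "欧巴马"))
    print(transliteration_similarity("布什", "巴马"))
```

Under these assumptions, pairs that score above some chosen threshold would be treated as candidate synonyms. The threshold and the similarity values would have to come from training data, which this sketch does not model.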
