Hindi CLIR in thirty days

Leah S Larkey,Nasreen Abduljaleel,Margaret E Connell

doi:10.1145/974740.974746

Journal: ACM Transactions on Asian Language Information Processing
Publication Date: Jun 1, 2003
Citations: 61

Affiliation: University of Massachusetts Amherst

Abstract

As participants in the TIDES Surprise Language exercise, researchers at the University of Massachusetts helped collect Hindi-English resources and developed a cross-language information retrieval system. Components included normalization, stop-word removal, transliteration, structured query translation, and language modeling using a probabilistic dictionary derived from a parallel corpus. Existing technology was successfully applied to Hindi. The biggest stumbling blocks were collecting parallel English and Hindi text and dealing with numerous proprietary encodings.
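
To make the idea of language modeling with a probabilistic dictionary concrete, the following is a minimal sketch of one common cross-language scoring scheme, not necessarily the formulation used by the authors. It scores an English query against a Hindi document by summing translation probabilities over the document's terms and smoothing with a background model. The translation table, its probabilities, the function name `score`, and the smoothing weight `lam` are all hypothetical and chosen only for illustration.

```python
import math
from collections import Counter

# Hypothetical toy table of P(english_term | hindi_term).
# In practice such values would be estimated from a parallel corpus.
TRANSLATION_PROBS = {
    "पानी": {"water": 0.9, "rain": 0.1},
    "नदी": {"river": 1.0},
    "बारिश": {"rain": 0.8, "water": 0.2},
}


def score(query_terms, doc_terms, background, lam=0.7):
    """Return the log-probability of an English query given a Hindi document.

    For each English query term e, the document model is
        P(e | D) = sum over Hindi terms h of P(e | h) * P(h | D),
    linearly smoothed with a background English model (an assumed choice).
    """
    counts = Counter(doc_terms)
    total = sum(counts.values())
    log_p = 0.0
    for e in query_terms:
        # Translate the document's term distribution into English.
        p_doc = sum(
            TRANSLATION_PROBS.get(h, {}).get(e, 0.0) * c / total
            for h, c in counts.items()
        )
        # Fall back to a small floor for terms missing from the background.
        p = lam * p_doc + (1 - lam) * background.get(e, 1e-6)
        log_p += math.log(p)
    return log_p


background = {"river": 0.01, "water": 0.01, "rain": 0.01}
river_doc = ["नदी", "पानी"]
rain_doc = ["बारिश", "बारिश"]
ranked = sorted(
    [("river_doc", river_doc), ("rain_doc", rain_doc)],
    key=lambda item: score(["river"], item[1], background),
    reverse=True,
)
```

With these toy values, the document containing "नदी" ranks above the one that only mentions "बारिश" for the query "river", because only the former contributes translation probability mass to that term.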

Full Text