Abstract

This paper describes a study of a Turkish-English cross-language information retrieval (CLIR) system. One of the biggest obstacles in CLIR studies is access to a bilingual parallel corpus, so the first step of this study was to construct a parallel Turkish-English corpus. The resulting corpus contains 1801 parallel documents. It was divided into two parts: one for training the system and one for testing it. Latent semantic indexing (LSI) techniques were applied to the training set to capture the relations between the two languages. After training, we ran a set of test queries to measure the effectiveness of LSI-based retrieval on the Turkish-English parallel corpus. Our experimental results show that LSI-based CLIR outperforms non-LSI-based retrieval, with retrieval success rates of 69% and 26%, respectively.
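
To make the LSI step concrete, the following is a minimal sketch, not the authors' implementation. It assumes the common cross-language LSI setup in which each Turkish document is concatenated with its English translation before the singular value decomposition; the abstract does not state whether the study did this. The toy corpus, the function names, whitespace tokenization, raw term counts, and the choice of k = 2 are all illustrative assumptions.

```python
import numpy as np

# Hypothetical toy parallel corpus (Turkish, English); not the paper's data.
PAIRS = [
    ("kedi süt içer", "cat drinks milk"),
    ("kedi balık yer", "cat eats fish"),
    ("banka faiz oranı", "bank interest rate"),
    ("banka kredi verir", "bank gives credit"),
]

def build_vocab(docs):
    """Map every whitespace-separated token to a row index."""
    tokens = sorted({tok for doc in docs for tok in doc.lower().split()})
    return {tok: i for i, tok in enumerate(tokens)}

def term_vector(text, vocab):
    """Raw term-count vector; tokens outside the vocabulary are ignored."""
    vec = np.zeros(len(vocab))
    for tok in text.lower().split():
        if tok in vocab:
            vec[vocab[tok]] += 1.0
    return vec

def train_lsi(pairs, k):
    """Assumed training step: merge each pair, then keep the top-k SVD factors."""
    merged = [tr + " " + en for tr, en in pairs]
    vocab = build_vocab(merged)
    # Term-document matrix: one row per term, one column per merged document.
    A = np.column_stack([term_vector(doc, vocab) for doc in merged])
    U, s, _ = np.linalg.svd(A, full_matrices=False)
    return vocab, U[:, :k], s[:k]

def fold_in(text, vocab, U_k, s_k):
    """Project a monolingual query or document into the k-dim latent space."""
    return (term_vector(text, vocab) @ U_k) / s_k

def cosine(a, b):
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0

def rank(query, docs, vocab, U_k, s_k):
    """Return document indices sorted by descending latent-space similarity."""
    q = fold_in(query, vocab, U_k, s_k)
    scores = [cosine(q, fold_in(d, vocab, U_k, s_k)) for d in docs]
    return sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)

vocab, U_k, s_k = train_lsi(PAIRS, k=2)
english_docs = ["the bank raised the interest rate", "fish for the cat"]
ranking = rank("süt", english_docs, vocab, U_k, s_k)
print(ranking)
```

In this toy setting the Turkish query "süt" shares no surface terms with either English document, so plain term matching would score both at zero. In the latent space, the query should land closest to the document about the cat, because "süt" co-occurs with "kedi" and "cat" in the merged training documents.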

