Abstract
In this study, we investigate information retrieval (IR) on Turkish texts using a large‐scale test collection that contains 408,305 documents and 72 ad hoc queries. We examine the effects of several stemming options and query‐document matching functions on retrieval performance. We show that a simple word truncation approach, a word truncation approach that uses language‐dependent corpus statistics, and an elaborate lemmatizer‐based stemmer provide similar retrieval effectiveness in Turkish IR. We also investigate the effects of a range of search conditions on retrieval performance; these include scalability issues, query and document length effects, and the use of a stopword list in indexing.
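As a rough illustration of what a simple word truncation stemmer does, the sketch below keeps only the first few characters of each token before indexing. The function names and the prefix length of 5 are assumptions made for this example and are not taken from the study; tokens are assumed to be already lowercased and split on whitespace.

```python
from typing import List


def truncate_stem(word: str, prefix_len: int = 5) -> str:
    """Return the first `prefix_len` characters of `word`.

    Words shorter than `prefix_len` are returned unchanged.
    The default of 5 is an illustrative assumption, not a value from the study.
    """
    return word[:prefix_len]


def index_terms(text: str, prefix_len: int = 5) -> List[str]:
    """Split `text` on whitespace and truncate every token.

    Assumes the text is already lowercased; Turkish-specific case
    handling (dotted and dotless i) is deliberately left out of this sketch.
    """
    return [truncate_stem(token, prefix_len) for token in text.split()]


if __name__ == "__main__":
    # "kitaplar" (books) and "kitap" (book) map to the same index term.
    print(index_terms("kitaplar kitap evler"))
```

Because Turkish is agglutinative and attaches suffixes to the end of a word, inflected forms of a word often share a prefix, which is why such a crude rule can conflate them without any dictionary or morphological analysis.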
More From: Journal of the American Society for Information Science and Technology