Abstract
Natural Language Processing (NLP) techniques have been explored as a way to enhance the performance of Information Retrieval (IR) methods, with varied results. Most efforts to use NLP techniques have aimed at identifying better index terms for representing documents. Such use in the indexing phase of IR has only an implicit effect on retrieval performance. The explicit use of NLP techniques during the retrieval, or information seeking, phase has so far been restricted to interactive or dialogue systems. Recent advances in IR use Statistical Language Models (SLM) to represent documents and rank them by the likelihood that each document's model generates a given user query. This paper presents a novel method for applying NLP techniques to user queries, specifically a syntactic parse of the query, within the statistical language modeling approach to IR. In the proposed method, named Concept Language Models, a query is viewed as a sequence of concepts and a concept as a sequence of terms. The paper presents different approximations for estimating the concept and term probabilities and for computing the query likelihood estimate for documents. Some empirical results on TREC test collections comparing Concept Language Models with smoothed N-gram language models are presented.
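To make the query-likelihood idea concrete, the following is a minimal sketch, not the paper's actual estimator. It assumes the query has already been split into concepts (for example by a syntactic parser, which is not shown), treats the terms within a concept as independent, and uses Jelinek-Mercer smoothing against collection statistics. The function names, the smoothing weight `lam`, and the toy documents are all hypothetical choices made for illustration.

```python
import math
from collections import Counter

def term_prob(term, doc_counts, doc_len, coll_counts, coll_len, lam=0.5):
    """Jelinek-Mercer smoothed estimate of P(term | document model).

    Assumption: linear interpolation with weight `lam` on the collection model.
    """
    p_doc = doc_counts[term] / doc_len if doc_len else 0.0
    p_coll = coll_counts[term] / coll_len if coll_len else 0.0
    return (1 - lam) * p_doc + lam * p_coll

def query_log_likelihood(concepts, doc_tokens, coll_counts, coll_len, lam=0.5):
    """Log-likelihood of a query given as a sequence of concepts.

    Each concept is a sequence of terms. Assumption: a concept's probability
    is approximated by the product of its smoothed term probabilities.
    """
    doc_counts = Counter(doc_tokens)
    doc_len = len(doc_tokens)
    score = 0.0
    for concept in concepts:
        for term in concept:
            p = term_prob(term, doc_counts, doc_len, coll_counts, coll_len, lam)
            if p == 0.0:
                # Term unseen in both document and collection.
                return float("-inf")
            score += math.log(p)
    return score

# Toy collection of two documents (hypothetical data).
d1 = "statistical language models for information retrieval".split()
d2 = "syntactic parse of a user query".split()
coll_counts = Counter(d1 + d2)
coll_len = len(d1) + len(d2)

# Query viewed as two concepts, each a sequence of terms.
concepts = [["language", "models"], ["information", "retrieval"]]

s1 = query_log_likelihood(concepts, d1, coll_counts, coll_len)
s2 = query_log_likelihood(concepts, d2, coll_counts, coll_len)
ranking = sorted([("d1", s1), ("d2", s2)], key=lambda x: x[1], reverse=True)
```

Under this independence assumption the score coincides with a smoothed unigram model; the concept grouping only marks where a richer within-concept estimate could be substituted.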