Abstract

Underlying many of the probabilistic models for information retrieval are assumptions, of varying degrees of severity, of stochastic dependence or independence among the index terms describing the documents. These models generally specify a matching function, that is, a function which compares a query with each document. The form of that function is to a large extent determined by the particular dependence/independence assumption. For example, if the index terms are assumed to be independently distributed over both the set of relevant and the set of non-relevant documents, then the matching function will in general be linear, whereas an assumption of dependence will lead to a non-linear function. Irrespective of the form that the matching function may take, it is always assumed that the search terms in the query are known. In this paper I wish to address the problem of the choice of search terms and how this choice may be affected by an independence assumption.
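As an illustration of the linear case only, the following is a minimal sketch of one common form of matching function under the independence assumption: each search term present in a document contributes a log-odds weight, and the score is the sum of those weights. The function names, the example terms, and the probability values are all hypothetical and chosen for illustration; they are not taken from the paper, and the paper's own formulation may differ.

```python
import math

def term_weight(p, q):
    """Log-odds weight for one term (illustrative form, assumed here).

    p: assumed probability that the term occurs in a relevant document
    q: assumed probability that the term occurs in a non-relevant document
    """
    return math.log((p * (1 - q)) / (q * (1 - p)))

def linear_match(query_terms, doc_terms, probs):
    """Sum the weights of the query terms that occur in the document.

    Because terms are treated as independent, each term contributes
    separately, so the score is linear in the term-presence indicators.
    """
    return sum(term_weight(*probs[t]) for t in query_terms if t in doc_terms)

# Hypothetical (p, q) estimates for three search terms.
probs = {
    "retrieval": (0.8, 0.2),
    "index": (0.5, 0.5),   # equally likely in both sets: weight is zero
    "term": (0.6, 0.3),
}
query = ["retrieval", "index", "term"]

score_a = linear_match(query, {"retrieval", "term"}, probs)
score_b = linear_match(query, {"index"}, probs)
```

In this sketch a term that is equally likely in relevant and non-relevant documents contributes nothing to the score, which hints at why the choice of search terms matters even before any dependence between terms is considered.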
