Axiomatic Analysis and Optimization of Information Retrieval Models

Chengxiang Zhai, Hui Fang

doi:10.1145/2499178.2499205

Abstract

The accuracy of a search engine is mostly determined by the optimality of the retrieval model it uses. Developing optimal retrieval models has always been an important fundamental research problem in information retrieval, because an improved general retrieval model would make all search engines more useful and thus have immediate broad impact. Extensive research has been done on developing an optimal retrieval model since the 1960s, leading to multiple effective retrieval models, including the Pivoted Normalization Vector Space model, BM25, Dirichlet Prior Query Likelihood, and PL2. However, these state-of-the-art retrieval models were all developed at least a decade ago, suggesting that it has been difficult to improve them further. One reason why these models could not easily be improved is that we do not have a good understanding of their deficiencies and have mostly relied on empirical evaluation to assess the superiority of a retrieval model.

Recently, an axiomatic way of analyzing and optimizing retrieval models has been developed and has shown great promise both in understanding the deficiencies of retrieval models and in developing more effective ones. The basic idea of this axiomatic framework is to specify a number of formal constraints that an optimal retrieval model is expected to satisfy, and to use them to assess the optimality of a retrieval model. Such an axiomatic way of modeling relevance provides a theoretical way to study how to develop an ultimately optimal retrieval model, enables analytical comparison of different retrieval models without necessarily requiring empirical evaluation, and has led to the development of multiple more effective retrieval models.

The purpose of this tutorial is to systematically explain this emerging axiomatic approach to developing optimal retrieval models, review and summarize the research progress achieved so far on this topic, and discuss promising future research directions in optimizing general retrieval models. Tutorial attendees can expect to learn, among other things, (1) the basic methodology of axiomatic analysis and optimization of retrieval models, (2) how to formalize retrieval heuristics with mathematical constraints, (3) the major retrieval constraints proposed so far, (4) the new retrieval functions derived by using the axiomatic approaches, (5) specific research directions to further develop more effective retrieval models, and (6) general open challenges in developing an ultimately optimal retrieval model. The tutorial should appeal to those who work on information retrieval models and those who are interested in applying axiomatic analysis to optimize specific retrieval functions in real applications. The tutorial should also be of interest to researchers who work on ranking problems in general. Attendees are assumed to know the basic concepts of information retrieval models.
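The idea of assessing a retrieval function against formal constraints can be illustrated with a small sketch. The abstract does not list specific constraints, so the two used here (a term-frequency constraint and a length-normalization constraint, commonly referred to as TFC1 and LNC1 in the axiomatic literature) are assumptions chosen for illustration. The BM25 variant, its parameter values, the collection statistics, and all function names are likewise hypothetical. The checks are numerical spot checks over a finite range, not the analytical proofs the axiomatic approach aims for.

```python
import math


def bm25_term_score(tf, doc_len, avg_doc_len, n_docs, doc_freq, k1=1.2, b=0.75):
    """Score contribution of one query term under a classic Okapi BM25 form.

    Assumes the query term occurs once in the query. The IDF factor used here,
    log((N - df + 0.5) / (df + 0.5)), becomes negative when df > N / 2.
    """
    idf = math.log((n_docs - doc_freq + 0.5) / (doc_freq + 0.5))
    length_norm = k1 * (1 - b + b * doc_len / avg_doc_len)
    return idf * tf * (k1 + 1) / (tf + length_norm)


def satisfies_tfc1(score, doc_len, max_tf=20):
    """TFC1 (assumed form): for documents of equal length, one more occurrence
    of the query term must give a strictly higher score."""
    return all(score(tf + 1, doc_len) > score(tf, doc_len) for tf in range(max_tf))


def satisfies_lnc1(score, tf, max_len=200):
    """LNC1 (assumed form): adding one non-query word to a document
    must not increase its score."""
    return all(score(tf, n + 1) <= score(tf, n) for n in range(tf, max_len))


# Hypothetical collection: 1000 documents, average length 100 words.
def rare_term(tf, doc_len):
    # Term appearing in 10 of 1000 documents (positive IDF).
    return bm25_term_score(tf, doc_len, avg_doc_len=100, n_docs=1000, doc_freq=10)


def common_term(tf, doc_len):
    # Term appearing in 900 of 1000 documents (negative IDF).
    return bm25_term_score(tf, doc_len, avg_doc_len=100, n_docs=1000, doc_freq=900)


for name, score in [("rare term", rare_term), ("common term", common_term)]:
    print(name, "TFC1:", satisfies_tfc1(score, doc_len=100),
          "LNC1:", satisfies_lnc1(score, tf=3))
```

Under these assumed statistics, the rare term passes both checks, while the very common term fails them because the IDF factor turns negative. This is the kind of conditional deficiency that a constraint-based analysis can expose without running a retrieval experiment.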