Reproduce. Generalize. Extend. On Information Retrieval Evaluation without Relevance Judgments

Kevin Roitero, Stefano Mizzaro, Marco Passon, Giuseppe Serra

doi:10.1145/3241064

Abstract

The evaluation of retrieval effectiveness by means of test collections is a commonly used methodology in the information retrieval field. Some researchers have addressed the research question of whether it is possible to evaluate effectiveness completely automatically, without human relevance assessments. Since human relevance assessment is one of the main costs of building a test collection, in both time and money, achieving this ambitious goal would have a practical impact. In this article, we reproduce the main results on evaluating information retrieval systems without relevance judgments; furthermore, we generalize this previous work to analyze the effect of test collections, evaluation metrics, and pool depth. We also extend the idea to semi-automatic evaluation and to the estimation of topic difficulty. Our results show that (i) previous work is overall reproducible, although some specific results are not; (ii) collection, metric, and pool depth affect the automatic evaluation of systems, which is nevertheless accurate in several cases; (iii) semi-automatic evaluation is an effective methodology; and (iv) automatic evaluation can, to some extent, be used to predict topic difficulty.

Journal: Journal of Data and Information Quality
Publication Date: Sep 29, 2018
Citations: 3

Similar Papers

Pooling-based continuous evaluation of information retrieval systems
Alberto Tonon ... Gianluca Demartini
Information Retrieval Journal | VOL. 18
08 Sep 2015

Social Informatics and Information Retrieval Systems
Xiaoya Tang
Bulletin of the American Society for Information Science and Technology | VOL. 26
01 Feb 2000

Presentation Ordering Effects On Assessor Agreement
Tadele T Damessie ... Falk Scholer
17 Oct 2018

Generation of High-Quality Relevant Judgments through Document Similarity and Document Pooling for the Evaluation of Information Retrieval Systems
Minnu Helen Joseph ... Sri Devi Ravana
02 Dec 2022
