Abstract
Offline evaluation of information retrieval systems depends on test collections. These datasets provide researchers with a corpus of documents, topics and relevance judgements indicating which documents are relevant for each topic. Gathering the latter is costly, as it requires human assessors to judge the documents, so experts usually judge only a portion of the corpus. The most common approach for selecting that subset is pooling. By intelligently choosing which documents to assess, it is possible to maximise the number of positive labels obtained for a given budget. For this reason, much work has focused on developing techniques to better select which documents from the corpus merit human assessment. In this article, we propose using relevance feedback to prioritise the documents when building new pooled test collections. We explore several state-of-the-art statistical feedback methods for prioritising the documents the algorithm presents to the assessors. A thorough comparison on eight Text Retrieval Conference (TREC) datasets against strong baselines shows that, among other results, our proposals retrieve relevant documents with lower assessment effort than other state-of-the-art adjudication methods, without harming reliability, fairness or reusability.
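To make the idea concrete, below is a minimal sketch of a relevance-feedback-driven adjudication loop over a document pool. It is not the authors' implementation: the `pool`, `judge` callback and the term-count feedback profile are hypothetical stand-ins for the pooled candidates, the human assessor and the statistical feedback methods studied in the paper.

```python
"""
Illustrative sketch (assumptions only, not the paper's method) of prioritising
pooled documents for assessment using relevance feedback: each positive
judgement updates a term profile that reranks the remaining pool.
"""
from collections import Counter


def tokenize(text: str) -> list[str]:
    return text.lower().split()


def adjudicate(pool: dict[str, str], judge, budget: int) -> dict[str, bool]:
    """Iteratively pick the next document to assess, guided by feedback."""
    judgements: dict[str, bool] = {}
    feedback_profile: Counter = Counter()  # term weights from judged-relevant docs
    remaining = dict(pool)

    for _ in range(min(budget, len(pool))):
        if feedback_profile:
            # Prioritise the unjudged document most similar to the feedback profile.
            next_id = max(
                remaining,
                key=lambda d: sum(feedback_profile[t] for t in tokenize(remaining[d])),
            )
        else:
            # No positive judgements yet: fall back to the original pool order.
            next_id = next(iter(remaining))

        relevant = judge(next_id)          # stand-in for the human assessor
        judgements[next_id] = relevant
        if relevant:
            feedback_profile.update(tokenize(remaining[next_id]))
        del remaining[next_id]

    return judgements


# Toy usage: two on-topic documents and one off-topic document.
docs = {
    "d1": "offline evaluation of retrieval systems with pooled judgements",
    "d2": "cooking recipes and kitchen tips",
    "d3": "relevance feedback for information retrieval evaluation",
}
qrels = {"d1": True, "d2": False, "d3": True}
print(adjudicate(docs, judge=lambda d: qrels[d], budget=3))
```

In this toy loop, once a relevant document is found, documents sharing its vocabulary are pushed to the front of the assessment queue, which is the intuition behind spending the judgement budget where positive labels are most likely.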