Focal elements of neural information retrieval models. An outlook through a reproducibility study

Stefano Marchesin, Alberto Purpura, Gianmaria Silvello

doi:10.1016/j.ipm.2019.102109

Open Access

Journal: Information Processing & Management	Publication Date: Sep 13, 2019
Citations: 16	License type: cc-by-nc-nd

Affiliation: University of Padua

Abstract

This paper analyzes two state-of-the-art Neural Information Retrieval (NeuIR) models: the Deep Relevance Matching Model (DRMM) and the Neural Vector Space Model (NVSM). Our contributions include: (i) a reproducibility study of these two NeuIR models, one supervised and one unsupervised, in which we report the issues we encountered while reproducing them; (ii) a performance comparison with other lexical, semantic and state-of-the-art models, showing that traditional lexical models are still highly competitive with DRMM and NVSM; (iii) an application of DRMM and NVSM to collections from heterogeneous search domains and in different languages, which allows us to identify the cases where DRMM and NVSM can be recommended; (iv) an evaluation of the impact of varying the word embedding model used by DRMM, showing that relevance-based representations generally outperform semantic-based ones; (v) a topic-by-topic evaluation of the selected NeuIR approaches against the well-known BM25 lexical model, with an in-depth analysis of the cases where DRMM and NVSM outperform BM25 and those where they fail to do so. We run an extensive experimental evaluation to check whether the improvements of NeuIR models over the selected baselines, if any, are statistically significant.

Full Text