Abstract

We introduce SPARTA, a novel neural retrieval method that shows strong performance, generalization, and interpretability for open-domain question answering. Unlike many neural ranking methods that rely on dense vector nearest-neighbor search, SPARTA learns a sparse representation that can be efficiently implemented as an inverted index. The resulting representation enables scalable neural retrieval that does not require expensive approximate vector search and outperforms its dense counterpart. We validated our approach on 4 open-domain question answering (OpenQA) tasks and 11 retrieval question answering (ReQA) tasks. SPARTA achieves new state-of-the-art results across a variety of open-domain question answering tasks on both English and Chinese datasets, including open-domain SQuAD and CMRC. Analysis also confirms that the proposed method creates human-interpretable representations and allows flexible control over the trade-off between performance and efficiency.
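The abstract's key efficiency claim is that a sparse representation turns retrieval into an inverted-index lookup instead of a dense nearest-neighbor search. Below is a minimal, hypothetical sketch (plain Python; names such as add_answer and search are illustrative, not from the paper's released code) of how sparse term weights can be served from an inverted index.

```python
from collections import defaultdict

# Hypothetical sparse index: term -> list of (answer_id, weight) postings.
index = defaultdict(list)

def add_answer(answer_id, term_weights):
    """Store an answer's sparse representation (term -> weight) as postings."""
    for term, weight in term_weights.items():
        index[term].append((answer_id, weight))

def search(query_terms, top_k=5):
    """Score candidates by summing the stored weights of the query's terms."""
    scores = defaultdict(float)
    for term in query_terms:
        for answer_id, weight in index.get(term, []):
            scores[answer_id] += weight
    return sorted(scores.items(), key=lambda kv: -kv[1])[:top_k]

# Example: two indexed answers, one query.
add_answer("a1", {"einstein": 2.3, "physics": 1.1})
add_answer("a2", {"music": 1.8, "physics": 0.4})
print(search(["who", "studied", "physics"]))  # a1 ranks above a2
```

Because only the query's terms are visited at search time, query latency scales with the number of matching postings rather than with corpus size, which is what makes the approach scalable without approximate vector search.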

Highlights

  • Prior work reduces the full attention in BERT (Devlin et al., 2018) to a simple inner-product interaction for dialog response retrieval. Open-domain Question Answering (OpenQA) is the task of answering a question based on a knowledge source, and our key research goal is to develop a method that can simultaneously achieve expressiveness and efficiency.

  • Unlike existing work based on dual encoders, we focus on learning a sparse representation that can be efficiently implemented as an inverted index.

  • SPARTA trained only on SQuAD outperforms the baselines, achieving a 54.1% gain over BM25, a 26.7% gain over USE-QA, and a 25.3% gain over Poly-Encoders in average Mean Reciprocal Rank (MRR; see the sketch below) across 11 different datasets.
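For reference, Mean Reciprocal Rank averages the reciprocal rank of the first correct answer across queries. A minimal sketch of the metric (illustrative code, not from the paper):

```python
def mean_reciprocal_rank(ranked_lists, gold_answers):
    """MRR: average of 1/rank of the first correct answer per query.

    ranked_lists: list of ranked candidate ids, one list per query.
    gold_answers: the correct id for each query.
    """
    total = 0.0
    for ranked, gold in zip(ranked_lists, gold_answers):
        rank = next((i + 1 for i, cand in enumerate(ranked) if cand == gold), None)
        if rank is not None:
            total += 1.0 / rank
    return total / len(gold_answers)

# Example: correct answer at ranks 1 and 2 -> MRR = (1 + 0.5) / 2 = 0.75
print(mean_reciprocal_rank([["a", "b"], ["c", "a"]], ["a", "a"]))
```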


Summary

Related Work

The classical approach to OpenQA depends on knowledge bases (KBs) that are manually or automatically curated, e.g., Freebase (Bollacker et al., 2008) and NELL (Fader et al., 2014). DrQA uses a search engine to filter down to relevant documents and then applies machine readers to extract the final answer (Chen et al., 2017). Two stages are needed because existing machine readers, for example BERT-based models, are too computationally expensive to apply to an entire corpus.

Proposed Method

Problem Formulation. We first formally define the problem of answer ranking for question answering. Typical passage-level retrieval systems set the answer a to be the passage and leave the context c empty (Chen et al., 2017; Yang et al., 2019a).

Token-level Interaction. SPARTA scoring uses token-level interaction between the query and the answer: each query token is encoded with a non-contextualized embedding e_i, while a contextualized transformer model encodes the answer and yields token-level embeddings s_j. Because the answer representation does not depend on the query, it can be pre-computed at indexing time; since indexing is an offline operation, the most powerful encoder available can be used. Parameters are optimized using back-propagation through the neural network.
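To make the token-level interaction concrete, here is a minimal sketch of such a score: each query term embedding e_i is matched against every contextualized answer token embedding s_j by inner product and max-pooled over the answer. The ReLU-then-log composition shown is an assumption based on the published SPARTA formulation, not stated in this summary.

```python
import numpy as np

def sparta_score(query_emb, answer_emb):
    """Token-level interaction score between one query and one answer.

    query_emb:  (n_q, d) non-contextualized query token embeddings e_i
    answer_emb: (n_a, d) contextualized answer token embeddings s_j
    """
    # Inner product between every query token and every answer token.
    interaction = query_emb @ answer_emb.T        # shape (n_q, n_a)
    # Max-pool over answer tokens: best matching answer token per query term.
    best_match = interaction.max(axis=1)          # shape (n_q,)
    # ReLU + log saturation per term (assumed composition), summed over
    # query terms to give the final relevance score.
    return float(np.log(np.maximum(best_match, 0.0) + 1.0).sum())

rng = np.random.default_rng(0)
q = rng.normal(size=(4, 8))    # 4 query tokens, dim 8
a = rng.normal(size=(20, 8))   # 20 answer tokens
print(sparta_score(q, a))
```

Since the answer-side embeddings s_j are query-independent, the per-term max-pooled weights can be computed once per answer and stored as the sparse postings used by the inverted index sketched earlier.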

Indexing and Inference
Learning to Rank
In-domain Performance
Model Analysis
Out-of-domain Generalization
Conclusion
Findings
Appendices