Designing a Minimal Retrieve-and-Read System for Open-Domain Question Answering

Sohee Yang and Minjoon Seo
In open-domain question answering (QA), the retrieve-and-read mechanism has the inherent benefits of interpretability and ease of adding, removing, or editing knowledge compared to the parametric approaches of closed-book QA models. However, it is also known to suffer from a large storage footprint due to its document corpus and index. Here, we discuss several orthogonal strategies to drastically reduce the footprint of a retrieve-and-read open-domain QA system by up to 160x. Our results…

A Survey for Efficient Open Domain Question Answering

This paper surveys open-domain QA (ODQA) models and summarizes the core techniques for efficiency. It also gives a quantitative analysis of memory cost, processing speed, and accuracy, along with an overall comparison.

Bridging the Training-Inference Gap for Dense Phrase Retrieval

This work proposes an efficient way of validating dense retrievers using a small subset of the entire corpus to validate various training strategies including unifying contrastive loss terms and using hard negatives for phrase retrieval, which largely reduces the training-inference discrepancy.

Dimension Reduction for Efficient Dense Retrieval via Conditional Autoencoder

A Conditional Autoencoder (ConAE) is proposed to compress the high-dimensional embeddings of dense retrieval while maintaining the same embedding distribution and better recovering the ranking features.

Boosted Dense Retriever

DrBoost is a dense retrieval ensemble inspired by boosting that produces representations 4x more compact while delivering comparable retrieval results, reducing latency and bandwidth needs by another 4x.

NeurIPS 2020 EfficientQA Competition: Systems, Analyses and Lessons Learned

The motivation and organization of the competition are described, the best submissions are reviewed, and system predictions are analyzed to inform a discussion of evaluation for open-domain QA.



PAQ: 65 Million Probably-Asked Questions and What You Can Do With Them

A new QA-pair retriever, RePAQ, is introduced to complement PAQ. It is found that PAQ preempts and caches test questions, enabling RePAQ to match the accuracy of recent retrieve-and-read models while being significantly faster.

How Much Knowledge Can You Pack into the Parameters of a Language Model?

It is shown that fine-tuning a pre-trained language model to answer questions without access to external knowledge scales surprisingly well with model size and outperforms models that explicitly look up knowledge on the open-domain variants of Natural Questions and WebQuestions.

Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks

This work explores a general-purpose fine-tuning recipe for retrieval-augmented generation (RAG), a class of models that combine pre-trained parametric and non-parametric memory for language generation, and finds that RAG models generate more specific, diverse, and factual language than a state-of-the-art parametric-only seq2seq baseline.

RoBERTa: A Robustly Optimized BERT Pretraining Approach

It is found that BERT was significantly undertrained, and can match or exceed the performance of every model published after it, and the best model achieves state-of-the-art results on GLUE, RACE and SQuAD.

Is Retriever Merely an Approximator of Reader?

This work makes a careful conjecture that the architectural constraint of the retriever, originally intended to enable approximate search, also seems to make the model more robust in large-scale search. It proposes to distill the reader into the retriever so that the retriever absorbs the strength of the reader while keeping its own benefit.

Pruning the Index Contents for Memory Efficient Open-Domain QA

This work presents a simple approach for pruning the contents of a massive index such that the open-domain QA system, together with the index, OS, and library components, fits into a 6GiB Docker image while retaining only 8% of the original index contents and losing only 3% EM accuracy.

A Memory Efficient Baseline for Open Domain Question Answering

This paper considers three strategies to reduce the index size of dense retriever-reader systems: dimension reduction, vector quantization and passage filtering, and shows that it is possible to get competitive systems using less than 6Gb of memory.

Leveraging Passage Retrieval with Generative Models for Open Domain Question Answering

Interestingly, it is observed that the performance of this method improves significantly as the number of retrieved passages increases, evidence that sequence-to-sequence models offer a flexible framework to efficiently aggregate and combine evidence from multiple passages.

Approximate Nearest Neighbor Negative Contrastive Learning for Dense Text Retrieval

Approximate nearest neighbor Negative Contrastive Estimation (ANCE) is presented, a training mechanism that constructs negatives from an Approximate Nearest Neighbor (ANN) index of the corpus. The index is updated in parallel with the learning process to select more realistic negative training instances.
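
The negative-selection idea in the ANCE summary above can be illustrated with a small sketch. This is not the authors' implementation: an exhaustive dot-product search stands in for the ANN index, and the function name `mine_hard_negatives`, the toy vectors, and the parameter `k` are all hypothetical, chosen only for illustration.

```python
def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def mine_hard_negatives(query_vec, passage_vecs, positive_ids, k=2):
    """Return ids of the k highest-scoring passages not labeled positive.

    Assumption: exhaustive scoring replaces the ANN index lookup that a
    real system would use over a large corpus.
    """
    ranked = sorted(
        range(len(passage_vecs)),
        key=lambda i: dot(query_vec, passage_vecs[i]),
        reverse=True,
    )
    # Top-ranked passages that are not positives are the "hard" negatives.
    return [i for i in ranked if i not in positive_ids][:k]


# Toy data (hypothetical): four 2-d passage embeddings and one query.
passages = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.8, 0.3]]
query = [1.0, 0.2]

negatives = mine_hard_negatives(query, passages, positive_ids={0}, k=2)
print(negatives)  # [1, 3]
```

In a training loop following this idea, the passage embeddings would be recomputed periodically with the current encoder, so that the mined negatives track what the model currently confuses with the positives.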