Corpus ID: 227746078

Distilling Knowledge from Reader to Retriever for Question Answering

@article{Izacard2021DistillingKF,
  title={Distilling Knowledge from Reader to Retriever for Question Answering},
  author={Gautier Izacard and Edouard Grave},
  journal={ArXiv},
  year={2021},
  volume={abs/2012.04584}
}
The task of information retrieval is an important component of many natural language processing systems, such as open domain question answering. While traditional methods were based on hand-crafted features, continuous representations based on neural networks recently obtained competitive results. A challenge of using such methods is to obtain supervised data to train the retriever model, corresponding to pairs of query and support documents. In this paper, we propose a technique to learn… 
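As the title indicates, the proposed technique distills relevance signals from the reader into the retriever: the reader's cross-attention over retrieved passages serves as a synthetic supervision signal, removing the need for annotated query-document pairs. A minimal sketch of one such distillation objective in PyTorch follows; the names, shapes, and the specific KL formulation are illustrative assumptions for exposition, not the authors' released code.

```python
# Illustrative sketch only: distilling reader attention into a retriever.
# Assumes retriever_scores and reader_attention_scores for one question's
# retrieved passages are already computed upstream (hypothetical inputs).
import torch
import torch.nn.functional as F

def distillation_loss(retriever_scores: torch.Tensor,
                      reader_attention_scores: torch.Tensor) -> torch.Tensor:
    """KL divergence between the reader's attention distribution (target)
    and the retriever's score distribution over one set of passages.

    retriever_scores:        (num_passages,) similarity scores, e.g. dot
                             products of question and passage embeddings.
    reader_attention_scores: (num_passages,) aggregated cross-attention
                             mass the reader assigns to each passage;
                             detached so only the retriever is updated.
    """
    log_p_retriever = F.log_softmax(retriever_scores, dim=-1)
    p_reader = F.softmax(reader_attention_scores.detach(), dim=-1)
    return F.kl_div(log_p_retriever, p_reader, reduction="sum")

# Toy usage with made-up scores for four retrieved passages.
retriever_scores = torch.tensor([2.0, 0.5, -1.0, 0.1], requires_grad=True)
reader_scores = torch.tensor([3.0, 1.0, -2.0, 0.0])
loss = distillation_loss(retriever_scores, reader_scores)
loss.backward()  # gradients reach the retriever scores only
```

In a full system the retriever scores would come from a bi-encoder and the targets from the reader's aggregated cross-attention, but the loss above captures the core idea: the retriever is trained to rank passages the way the reader attends to them.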

Citations

Retrieval Data Augmentation Informed by Downstream Question Answering Performance
TLDR
This work identifies relevant passages based on whether they are useful for a trained QA model to arrive at the correct answers, and develops a search process guided by the QA model's loss that generalizes better to the end QA task.
You Only Need One Model for Open-domain Question Answering
TLDR
This work proposes casting the retriever and the reranker as hard-attention mechanisms applied sequentially within the transformer architecture and feeding the resulting computed representations to the reader, which leads to better gradient flow when the architecture is trained in an end-to-end manner.
Towards Unsupervised Dense Information Retrieval with Contrastive Learning
TLDR
This work explores the limits of contrastive learning as a way to train unsupervised dense retrievers, and shows that it leads to strong retrieval performance on the BEIR benchmark.
KG-FiD: Infusing Knowledge Graph in Fusion-in-Decoder for Open-Domain Question Answering
TLDR
This paper introduces a knowledge-enhanced version of Fusion-in-Decoder, namely KG-FiD, which uses a knowledge graph to establish the structural relationship among the retrieved passages, and a graph neural network to re-rank the passages and select only a top few for further processing.
End-to-End Training of Multi-Document Reader and Retriever for Open-Domain Question Answering
TLDR
An end-to-end differentiable training method for retrieval-augmented open-domain question answering systems that combine information from multiple retrieved documents when generating answers and demonstrates the feasibility of learning to retrieve to improve answer generation without explicit supervision of retrieval decisions.
A Memory Efficient Baseline for Open Domain Question Answering
TLDR
This paper considers three strategies to reduce the index size of dense retriever-reader systems: dimension reduction, vector quantization and passage filtering, and shows that it is possible to get competitive systems using less than 6 GB of memory.
Dense Hierarchical Retrieval for Open-Domain Question Answering
TLDR
Dense Hierarchical Retrieval (DHR) is proposed, a hierarchical framework which can generate accurate dense representations of passages by utilizing both macroscopic semantics in the document and microscopic semantics specific to each passage.
A Thousand Words Are Worth More Than a Picture: Natural Language-Centric Outside-Knowledge Visual Question Answering
TLDR
A Transform-Retrieve-Generate (TRiG) framework is proposed, which can be plugged into alternative image-to-text models and textual knowledge bases, and outperforms all state-of-the-art supervised methods by an absolute margin of at least 11.1%.
Hindsight: Posterior-guided training of retrievers for improved open-ended generation
TLDR
This work models the guide retriever on the posterior distribution Q of passages given the input and the target output, and trains it jointly with the standard retriever and the generator by maximizing the evidence lower bound (ELBo) in expectation over Q.
Evidentiality-guided Generation for Knowledge-Intensive NLP Tasks
TLDR
This work introduces a method to incorporate the evidentiality of passages (whether a passage contains correct evidence to support the output) into training the generator, and introduces a multi-task learning framework to jointly generate the output and predict the evidentiality of each passage.

References

SHOWING 1-10 OF 39 REFERENCES
Relevance-guided Supervision for OpenQA with ColBERT
TLDR
This work proposes a weak supervision strategy that iteratively uses ColBERT to create its own training data, which greatly improves OpenQA retrieval on both Natural Questions and TriviaQA, and the resulting end-to-end OpenQA system attains state-of-the-art performance on both of those datasets.
Latent Retrieval for Weakly Supervised Open Domain Question Answering
TLDR
It is shown for the first time that it is possible to jointly learn the retriever and reader from question-answer string pairs, without any IR system, outperforming BM25 by up to 19 points in exact match.
REALM: Retrieval-Augmented Language Model Pre-Training
TLDR
The effectiveness of Retrieval-Augmented Language Model pre-training (REALM) is demonstrated by fine-tuning on the challenging task of Open-domain Question Answering (Open-QA) and is found to outperform all previous methods by a significant margin, while also providing qualitative benefits such as interpretability and modularity.
Reading Wikipedia to Answer Open-Domain Questions
TLDR
This approach combines a search component based on bigram hashing and TF-IDF matching with a multi-layer recurrent neural network model trained to detect answers in Wikipedia paragraphs, indicating that both modules are highly competitive with respect to existing counterparts.
Pre-training Tasks for Embedding-based Large-scale Retrieval
TLDR
It is shown that the key ingredient of learning a strong embedding-based Transformer model is the set of pre-training tasks, and that with adequately designed paragraph-level pre-training tasks, Transformer models can remarkably improve over the widely used BM25 as well as embedding models without Transformers.
Passage Re-ranking with BERT
TLDR
A simple re-implementation of BERT for query-based passage re-ranking achieves state-of-the-art results on the TREC-CAR dataset and the top entry on the leaderboard of the MS MARCO passage retrieval task, outperforming the previous state of the art by 27% in MRR@10.
Neural Ranking Models with Weak Supervision
TLDR
This paper proposes to train a neural ranking model using weak supervision, where labels are obtained automatically without human annotators or any external resources, and suggests that supervised neural ranking models can greatly benefit from pre-training on large amounts of weakly labeled data that can be easily obtained from unsupervised IR models.
SQuAD: 100,000+ Questions for Machine Comprehension of Text
TLDR
A strong logistic regression model is built, which achieves an F1 score of 51.0%, a significant improvement over a simple baseline (20%).
End-to-End Open-Domain Question Answering with BERTserini
TLDR
An end-to-end question answering system that integrates BERT with the open-source Anserini information retrieval toolkit is demonstrated, showing that fine-tuning pretrained BERT on SQuAD is sufficient to achieve high accuracy in identifying answer spans.
Is Retriever Merely an Approximator of Reader?
TLDR
This work makes a careful conjecture that the architectural constraint of the retriever, which was originally intended to enable approximate search, also seems to make the model more robust in large-scale search, and proposes to distill the reader into the retriever so that the retriever absorbs the strength of the reader while keeping its own benefit.