Publications
Dense Passage Retrieval for Open-Domain Question Answering
TLDR
This work shows that retrieval can be practically implemented using dense representations alone, where embeddings are learned from a small number of questions and passages by a simple dual-encoder framework.
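As a concrete illustration of the dual-encoder idea, here is a minimal retrieval sketch using the pretrained DPR checkpoints published on the Hugging Face Hub; the checkpoint names and `pooler_output` usage follow the transformers library, and the toy passages are made up for illustration.

```python
import torch
from transformers import (
    DPRContextEncoder, DPRContextEncoderTokenizer,
    DPRQuestionEncoder, DPRQuestionEncoderTokenizer,
)

# Two separate encoders: one for questions, one for passages.
q_tok = DPRQuestionEncoderTokenizer.from_pretrained("facebook/dpr-question_encoder-single-nq-base")
q_enc = DPRQuestionEncoder.from_pretrained("facebook/dpr-question_encoder-single-nq-base")
c_tok = DPRContextEncoderTokenizer.from_pretrained("facebook/dpr-ctx_encoder-single-nq-base")
c_enc = DPRContextEncoder.from_pretrained("facebook/dpr-ctx_encoder-single-nq-base")

passages = [
    "William Shakespeare wrote the play Hamlet.",
    "The Eiffel Tower is located in Paris.",
]
question = "Who wrote Hamlet?"

with torch.no_grad():
    # Each encoder maps text to a fixed-size dense vector (pooler_output).
    p_emb = c_enc(**c_tok(passages, padding=True, return_tensors="pt")).pooler_output
    q_emb = q_enc(**q_tok(question, return_tensors="pt")).pooler_output

# Relevance is the inner product between question and passage embeddings.
scores = q_emb @ p_emb.T
print(passages[scores.argmax().item()])
```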
Language Models as Knowledge Bases?
TLDR
An in-depth analysis of the relational knowledge already present (without fine-tuning) in a wide range of state-of-the-art pretrained language models finds that BERT contains relational knowledge competitive with traditional NLP methods that have some access to oracle knowledge.
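A minimal sketch of this kind of probing, assuming the Hugging Face fill-mask pipeline and a standard BERT checkpoint: the model is queried with a cloze statement and no fine-tuning, and its top predictions for the blank are read off as "knowledge".

```python
from transformers import pipeline

# Cloze-style probe: ask a frozen BERT to fill in a relational fact.
fill = pipeline("fill-mask", model="bert-base-cased")
for pred in fill("The capital of France is [MASK]."):
    print(f"{pred['token_str']:>10}  {pred['score']:.3f}")
```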
MLQA: Evaluating Cross-lingual Extractive Question Answering
TLDR
This work presents MLQA, a multi-way aligned extractive QA evaluation benchmark intended to spur research in this area, and evaluates state-of-the-art cross-lingual models and machine-translation-based baselines on MLQA.
Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks
TLDR
This work presents a general-purpose fine-tuning recipe for retrieval-augmented generation (RAG), a family of models that combine pre-trained parametric and non-parametric memory for language generation, and finds that RAG models generate more specific, diverse and factual language than a state-of-the-art parametric-only seq2seq baseline.
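A short sketch of running a released RAG checkpoint, following the transformers documentation; `use_dummy_dataset=True` substitutes a tiny toy index for the full Wikipedia index, so the outputs here are illustrative only.

```python
from transformers import RagTokenizer, RagRetriever, RagSequenceForGeneration

tokenizer = RagTokenizer.from_pretrained("facebook/rag-sequence-nq")
# The retriever is the non-parametric memory: a dense index over passages.
retriever = RagRetriever.from_pretrained(
    "facebook/rag-sequence-nq", index_name="exact", use_dummy_dataset=True
)
model = RagSequenceForGeneration.from_pretrained("facebook/rag-sequence-nq", retriever=retriever)

inputs = tokenizer("who wrote the play hamlet", return_tensors="pt")
# Generation marginalizes over the retrieved passages.
generated = model.generate(input_ids=inputs["input_ids"])
print(tokenizer.batch_decode(generated, skip_special_tokens=True))
```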
Interpretation of Natural Language Rules in Conversational Machine Reading
TLDR
This paper formalises this task, develops a crowd-sourcing strategy to collect 37k task instances based on real-world rules and crowd-generated questions and scenarios, and assesses the task's difficulty by evaluating the performance of rule-based and machine-learning baselines.
Unsupervised Question Answering by Cloze Translation
TLDR
It is found that modern QA models can learn to answer human questions surprisingly well using only synthetic training data, and it is demonstrated that, without using the SQuAD training data at all, this approach achieves 56.4 F1 on SQuAD v1.
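A toy illustration of the cloze-translation idea: mask an answer span to form a cloze statement, then heuristically "translate" it into a natural question. The helper names and the purely rule-based translation step are assumptions for illustration; the paper also explores unsupervised machine translation for this step.

```python
def sentence_to_cloze(sentence: str, answer: str) -> str:
    # Replace the chosen answer span with a blank to form a cloze statement.
    return sentence.replace(answer, "____")

def cloze_to_question(cloze: str, wh_word: str = "what") -> str:
    # Rule-based "translation": substitute a wh-word for the blank.
    question = cloze.replace("____", wh_word, 1)
    return question[0].upper() + question[1:].rstrip(".") + "?"

sentence = "William Shakespeare wrote the play Hamlet."
cloze = sentence_to_cloze(sentence, "William Shakespeare")
print(cloze)                            # ____ wrote the play Hamlet.
print(cloze_to_question(cloze, "who"))  # Who wrote the play Hamlet?
```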
Answering Complex Open-Domain Questions with Multi-Hop Dense Retrieval
TLDR
This work proposes a simple and efficient multi-hop dense retrieval approach for answering complex open-domain questions, which achieves state-of-the-art performance on two multi-hop datasets, HotpotQA and multi-evidence FEVER, and can be applied to any unstructured text corpus.
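Schematically, multi-hop dense retrieval reformulates the query at each hop by appending the previously retrieved passage. The sketch below captures that control flow only; `encode` is a stand-in placeholder, not the paper's trained encoder, and the corpus is made up for illustration.

```python
import numpy as np

def encode(text: str) -> np.ndarray:
    # Placeholder embedding; a real system plugs in a trained dense encoder.
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    v = rng.standard_normal(128)
    return v / np.linalg.norm(v)

def retrieve(query: str, corpus: list[str]) -> str:
    # Inner-product search over passage embeddings.
    scores = np.stack([encode(p) for p in corpus]) @ encode(query)
    return corpus[int(np.argmax(scores))]

corpus = [
    "Hamlet was written by William Shakespeare.",
    "Shakespeare was born in Stratford-upon-Avon.",
]
question = "Where was the author of Hamlet born?"

hop1 = retrieve(question, corpus)
# Second hop: the query is the question concatenated with the hop-1 passage.
hop2 = retrieve(question + " " + hop1, corpus)
print(hop1, "->", hop2)
```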
Question and Answer Test-Train Overlap in Open-Domain Question Answering Datasets
TLDR
A detailed study of the test sets of three popular open-domain benchmark datasets finds that 30% of test-set questions have a near-duplicate paraphrase in their corresponding train sets, and that simple nearest-neighbor models outperform a BART closed-book QA model.
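The nearest-neighbor baseline referenced above can be very small: answer each test question with the answer attached to its most similar training question. TF-IDF similarity is one concrete choice, assumed here for illustration.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

train_qs = ["who wrote hamlet", "what is the capital of france"]
train_as = ["William Shakespeare", "Paris"]

vec = TfidfVectorizer().fit(train_qs)

def nn_answer(test_q: str) -> str:
    # Return the answer of the most similar training question.
    sims = cosine_similarity(vec.transform([test_q]), vec.transform(train_qs))
    return train_as[sims.argmax()]

print(nn_answer("who was the writer of hamlet"))  # William Shakespeare
```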
KILT: a Benchmark for Knowledge Intensive Language Tasks
TLDR
It is found that a shared dense vector index coupled with a seq2seq model is a strong baseline, outperforming more tailor-made approaches for fact checking, open-domain question answering and dialogue, and yielding competitive results on entity linking and slot filling, by generating disambiguated text.
How Context Affects Language Models' Factual Predictions
TLDR
This paper reports that augmenting pre-trained language models with relevant retrieved context dramatically improves their factual predictions, and that the resulting system, despite being unsupervised, is competitive with a supervised machine reading baseline.
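A minimal sketch of the core manipulation, assuming the Hugging Face fill-mask pipeline: the same frozen language model answers a cloze query with and without a context passage prepended.

```python
from transformers import pipeline

fill = pipeline("fill-mask", model="bert-base-cased")
cloze = "Dante was born in [MASK]."
context = "Dante Alighieri was a poet born in Florence."
top = lambda outs: outs[0]["token_str"]  # highest-scoring prediction

print(top(fill(cloze)))                  # LM alone
print(top(fill(context + " " + cloze)))  # LM + context prepended
```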