Questions Are All You Need to Train a Dense Passage Retriever

Devendra Singh Sachan, Mike Lewis, Dani Yogatama, Luke Zettlemoyer, Joëlle Pineau, Manzil Zaheer
We introduce ART, a new corpus-level autoencoding approach for training dense retrieval models that does not require any labeled training data. Dense retrieval is a central challenge for open-domain tasks, such as Open QA, where state-of-the-art methods typically require large supervised datasets with custom hard-negative mining and denoising of positive examples. ART, in contrast, only requires access to unpaired inputs and outputs (e.g. questions and potential answer passages). It uses a…

