Corpus ID: 245144556

Large Dual Encoders Are Generalizable Retrievers

@article{Ni2021LargeDE,
  title={Large Dual Encoders Are Generalizable Retrievers},
  author={Jianmo Ni and Chen Qu and Jing Lu and Zhuyun Dai and Gustavo Hernández Ábrego and Ji Ma and Vincent Zhao and Yi Luan and Keith B. Hall and Ming-Wei Chang and Yinfei Yang},
  journal={ArXiv},
  year={2021},
  volume={abs/2112.07899}
}
It has been shown that dual encoders trained on one domain often fail to generalize to other domains for retrieval tasks. One widespread belief is that the bottleneck layer of a dual encoder, where the final score is simply a dot product between a query vector and a passage vector, is too limited to make dual encoders an effective retrieval model for out-of-domain generalization. In this paper, we challenge this belief by scaling up the size of the dual encoder model while keeping the bottleneck… 
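The bottleneck the abstract refers to is easy to state in code: each text is mapped to a single fixed-size vector, and relevance is the dot product of the query and passage vectors. The sketch below is purely illustrative; `encode` is a hypothetical stand-in for a Transformer encoder (the paper scales up T5-based encoders), not the authors' model.

```python
import numpy as np

def encode(text: str, dim: int = 768) -> np.ndarray:
    """Hypothetical encoder: maps text to one fixed-size vector (the 'bottleneck')."""
    # A real dual encoder would run a Transformer here; we fake it deterministically.
    seed = abs(hash(text)) % (2**32)
    return np.random.default_rng(seed).standard_normal(dim)

def score(query: str, passage: str) -> float:
    """Relevance is simply the dot product of the two vectors."""
    return float(encode(query) @ encode(passage))

passages = [
    "Dual encoders embed queries and passages separately.",
    "Cross-encoders jointly attend over the query-passage pair.",
]
query = "how do dual encoders score passages?"
ranked = sorted(passages, key=lambda p: score(query, p), reverse=True)
print(ranked[0])
```

Because passages can be encoded offline and indexed, query-time scoring reduces to a nearest-neighbor search over these vectors, which is what makes the dual-encoder design attractive despite its limited query-passage interaction.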

Aggretriever: A Simple Approach to Aggregate Textual Representation for Robust Dense Passage Retrieval

TLDR
This work demonstrates that MLM pre-trained transformers can be used to effectively encode text information into a single vector for dense retrieval.

DPTDR: Deep Prompt Tuning for Dense Passage Retrieval

TLDR
The proposed method, called DPTDR, outperforms previous state-of-the-art models on both MS-MARCO and Natural Questions and is compatible with any pre-trained language model and retrieval task.

UnifieR: A Unified Retriever for Large-Scale Retrieval

TLDR
A new learning framework, UnifieR, is proposed, which combines dense-vector and lexicon-based retrieval in one model with a dual-representing capability, and experiments on passage retrieval benchmarks verify its effectiveness in both paradigms.

ERNIE-Search: Bridging Cross-Encoder with Dual-Encoder via Self On-the-fly Distillation for Dense Passage Retrieval

TLDR
This paper introduces a self on-the-fly distillation method that can effectively distill late interaction and incorporates a cascade distillation process to further improve the performance with a cross-encoder teacher.

Multi-Task Retrieval-Augmented Text Generation with Relevance Sampling

TLDR
A simple yet effective approach to clean the training set by utilizing a distinct property of knowledge-intensive generation: The connection of query-answer pairs to items in the knowledge base, which scales well with increased model capacity and achieves state-of-the-art results in seven KILT tasks.

RetroMAE: Pre-training Retrieval-oriented Transformers via Masked Auto-Encoder

TLDR
A novel pre-training framework for dense retrieval based on the Masked Auto-Encoder, known as RetroMAE, is proposed, which notably outperforms existing pre-trained models on a wide range of dense retrieval benchmarks, such as MS MARCO, open-domain question answering, and BEIR.

Salient Phrase Aware Dense Retrieval: Can a Dense Retriever Imitate a Sparse One?

TLDR
The Salient Phrase Aware Retriever (SPAR) is introduced, a dense retriever with the lexical matching capacity of a sparse model; it sets a new state of the art for dense and sparse retrievers and can match or exceed the performance of more complicated dense-sparse hybrid systems.

Parameter-Efficient Prompt Tuning Makes Generalized and Calibrated Neural Text Retrievers

TLDR
Through an extensive analysis, it is shown that the prompt tuning strategy can mitigate the two issues faced by fine-tuning based retrieval methods, parameter inefficiency and weak generalizability, and can improve the out-of-domain zero-shot generalization of retrieval models.

Masked Autoencoders As The Unified Learners For Pre-Trained Sentence Representation

TLDR
The recently proposed MAE style pre-training strategy, RetroMAE, is extended, such that it may effectively support a wide variety of sentence representation tasks and is proved to be effective for zero-shot retrieval.

A Thorough Examination on Zero-shot Dense Retrieval

TLDR
This paper presents the first thorough examination of the zero-shot capability of DR models, discusses the effect of several key factors related to the source training set, analyzes the potential bias from the target dataset, and reviews and compares existing zero-shot DR models.

References

Showing 1-10 of 32 references

BEIR: A Heterogenous Benchmark for Zero-shot Evaluation of Information Retrieval Models

TLDR
This work extensively analyzes different retrieval models and provides several suggestions that it believes may be useful for future work, finding that performing well consistently across all datasets is challenging.

Zero-shot Neural Retrieval via Domain-targeted Synthetic Query Generation

TLDR
An approach to zero-shot learning (Xian et al., 2018) for ad-hoc retrieval models that relies on synthetic query generation is proposed, which allows the creation of arbitrarily large, yet noisy, query-document relevance pairs that are domain targeted.

RocketQA: An Optimized Training Approach to Dense Passage Retrieval for Open-Domain Question Answering

TLDR
This work proposes an optimized training approach, called RocketQA, to improve dense passage retrieval, which significantly outperforms previous state-of-the-art models on both MS MARCO and Natural Questions, and demonstrates that the performance of end-to-end QA can be improved based on the RocketQA retriever.

Sparse, Dense, and Attentional Representations for Text Retrieval

TLDR
A simple neural model is proposed that combines the efficiency of dual encoders with some of the expressiveness of more costly attentional architectures, and sparse-dense hybrids are explored to capitalize on the precision of sparse retrieval.
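As a hedged illustration of the sparse-dense hybrid idea mentioned above (not the paper's actual model), a hybrid retriever can interpolate a lexical score with a dense dot-product score; the weight `lam` below is a hypothetical parameter introduced only for this sketch.

```python
import numpy as np

def sparse_score(query_terms: set, doc_terms: set) -> float:
    """Toy lexical score: term overlap (a real system would use BM25 or similar)."""
    return float(len(query_terms & doc_terms))

def dense_score(q_vec: np.ndarray, d_vec: np.ndarray) -> float:
    """Dense score: dot product of encoder outputs."""
    return float(q_vec @ d_vec)

def hybrid_score(q_terms, d_terms, q_vec, d_vec, lam: float = 0.5) -> float:
    # lam trades off lexical precision against dense semantic matching.
    return lam * sparse_score(q_terms, d_terms) + (1.0 - lam) * dense_score(q_vec, d_vec)
```

In practice the two score distributions would be normalized before interpolation, since raw lexical and dense scores live on different scales.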

Dense Passage Retrieval for Open-Domain Question Answering

TLDR
This work shows that retrieval can be practically implemented using dense representations alone, where embeddings are learned from a small number of questions and passages by a simple dual-encoder framework.

Attention is All you Need

TLDR
A new simple network architecture, the Transformer, based solely on attention mechanisms and dispensing with recurrence and convolutions entirely, is proposed; it generalizes well to other tasks, as shown by applying it successfully to English constituency parsing with both large and limited training data.

Multilingual Universal Sentence Encoder for Semantic Retrieval

TLDR
On transfer learning tasks, the multilingual embeddings approach, and in some cases exceed, the performance of English-only sentence embeddings.

Zero-shot Neural Passage Retrieval via Domain-targeted Synthetic Question Generation

TLDR
Empirically, it is shown that this is an effective strategy for building neural passage retrieval models in the absence of large training corpora and depending on the domain, this technique can even approach the accuracy of supervised models.

From doc2query to docTTTTTquery

TLDR
The setup in this work follows doc2query, but with T5 as the expansion model, and it is found that the top-k sampling decoder produces more effective queries than beam search.
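A hedged sketch of this doc2query-style expansion with the Hugging Face transformers library is shown below; the checkpoint name and decoding settings are assumptions for illustration, not necessarily the exact configuration used in the work.

```python
from transformers import T5ForConditionalGeneration, T5Tokenizer

model_name = "castorini/doc2query-t5-base-msmarco"  # assumed checkpoint name
tokenizer = T5Tokenizer.from_pretrained(model_name)
model = T5ForConditionalGeneration.from_pretrained(model_name)

passage = "The Manhattan Project produced the first nuclear weapons during World War II."
inputs = tokenizer(passage, return_tensors="pt", truncation=True)

# Top-k sampling (rather than beam search) yields more diverse expansion queries.
outputs = model.generate(
    **inputs,
    max_length=64,
    do_sample=True,
    top_k=10,
    num_return_sequences=3,
)
for seq in outputs:
    print(tokenizer.decode(seq, skip_special_tokens=True))
```

The sampled queries are appended to the original document before indexing, so that a term-matching retriever such as BM25 can match query terms the document itself never states.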

Language-agnostic BERT Sentence Embedding

TLDR
It is shown that introducing a pre-trained multilingual language model dramatically reduces the amount of parallel training data required to achieve good performance by 80%, and a model that achieves 83.7% bi-text retrieval accuracy over 112 languages on Tatoeba is released.